Abstract

The Rayleigh product channel model is useful in capturing the performance degradation due to rank deficiency of MIMO channels. In this paper, such a performance degradation is investigated via the channel outage probability, assuming a slowly varying channel with delay-constrained decoding. Using techniques of free probability theory, the asymptotic variance of channel capacity is derived when the dimensions of the channel matrices approach infinity. In this asymptotic regime, the channel capacity is rigorously proven to be Gaussian distributed. Using the obtained results, a fundamental tradeoff between multiplexing gain and diversity gain of Rayleigh product channels can be characterized by a closed-form expression at any finite signal-to-noise ratio. Numerical results are provided to compare the relative outage performance between Rayleigh product channels and conventional Rayleigh MIMO channels.

Index Terms 

Central limit theorem; finite-SNR diversity-multiplexing tradeoff; free probability theory; MIMO; 
outage capacity; Rayleigh product channels. 

I. Introduction 

Multi-Input Multi-Output (MIMO) wireless communications have received considerable attention since they are seen as the most promising way to increase link level capacity. Extensive works have focused on the performance of MIMO channels assuming a rich scattering environment. Therein, the presumed models include full-rank independent Rayleigh or Rician MIMO channels. However, in certain environments the propagation may be subject to structural limits of fading channels caused by either insufficient scattering [1], [?] or the so-called keyhole effect [?]. These channels exhibit rank deficiency compared to the independent Rayleigh and Rician models. The MIMO model that captures these effects is referred to as the double-scattering channel [?]. It is characterized by a matrix product involving three deterministic matrices (i.e., transmit, receiver, and scatterer correlation matrices), and two statistically independent complex Gaussian matrices. In a typical office environment, empirical measurements have been used to demonstrate the validity of the double-scattering channel model [?].

There exist a number of studies concerning the information-theoretic quantities of the double-scattering channels. Shin et al. derived an upper bound for the ergodic capacity [?, Th. III.3] and an exact expression for a single keyhole channel [?, Th. III.4]. The diversity-multiplexing tradeoff of the double-scattering channel was obtained in [?]. The authors in [?] investigated the asymptotic Rayleigh-limit when one of the matrix dimensions approaches infinity. In this case, the double-scattering model reduces to an equivalent Rayleigh MIMO channel. Furthermore, if all matrix dimensions are large, the ergodic capacity has been obtained in [?] via numerical integration. Recently, an asymptotic expression for ergodic capacity of the double-scattering channels was derived in [?]. Moreover, authors in [?]-[11] derived the ergodic mutual information for finite dimensional channel matrices. However, all the above results are valid for ergodic channels, where each codeword has infinite length. For many practical communication systems such as WLANs [12], the channels, albeit random, are slowly varying and the encoding/decoding process is subject to a delay constraint with moderate target packet error rates. The fading channels seen by each codeword are therefore non-ergodic. In this case, the ergodic capacity has no physical significance, whereas the outage capacity is a more relevant performance metric [13]. In the literature, the outage capacity has been studied for conventional Rayleigh MIMO channels [14]-[20] as well as for Rician MIMO channels [21]-[23] via various random matrix techniques.

To the best of our knowledge, the outage capacity for the double-scattering channel has not been addressed in the most general form¹. It turns out to be a difficult random matrix theory problem. To gain insights into the outage behavior of the double-scattering channel, we consider a simplified channel model involving a product of two statistically independent complex Gaussian matrices, also known as the Rayleigh product channel. This channel model corresponds to the scenario where the antenna elements as well as the scattering objects are sufficiently separated and there is no spatial correlation at antenna arrays or between scatterers. To characterize the capacity fluctuations, we use the free probability theory for large dimensional random matrices [24]-[26]. By utilizing the second order Cauchy transform and R-transform machinery, we derive a compact expression for the asymptotic variance of the capacity of the Rayleigh product channel. We further show that the channel capacity distribution is asymptotically Gaussian by proving a Central Limit Theorem (CLT) for the Linear Spectral Statistics (LSS) of the Rayleigh product ensemble. This result generalizes the CLT for correlated Wishart random matrices [27] and the CLT for Rayleigh product ensembles from polynomial LSS to generic analytic functions [28]. The capacity distribution is then utilized to study the corresponding finite Signal-to-Noise-Ratio (SNR) Diversity-Multiplexing Tradeoff (DMT). The derived results in this paper are formally valid when the dimensions of the channel matrices grow to infinity. However, numerical simulations show that they serve as good approximations when the numbers of antennas and scatterers are comparable to practical systems.

¹ Note that the authors in [18] derived the outage probability of a similar MIMO model with random steering matrices at antenna arrays. However, these steering matrices are slowly varying compared to multi-path fading and are considered as deterministic. Thus, the tools in [18] are not applicable here.

The rest of the paper is organized as follows. In Section II, we give the channel model, the signal model as well as the MIMO capacity formulation. In Section III, we study the second order eigenvalue fluctuations and the asymptotic capacity variance. The second order Cauchy transform of Rayleigh product ensembles is derived in Section IV. The CLT of the capacity of Rayleigh product channels is proved in Section V. Based on this result, the approximations for outage probability and the finite-SNR DMT are calculated. In Section VI, we conclude the main findings of the paper. Proofs of the technical results are provided in the Appendices.

Notations. Throughout the paper, vectors are represented by lower-case bold-face letters, and matrices are represented by upper-case bold-face letters. The complex vector field with length $n$ is denoted as $\mathbb{C}^n$. We use $\mathcal{CN}(\mathbf{0}, \mathbf{A})$ to denote the zero-mean complex Gaussian vector with covariance matrix $\mathbf{A}$ and $\mathbf{I}_n$ is an $n \times n$ identity matrix. The superscript $(\cdot)^\dagger$ denotes the matrix conjugate-transpose operation and $(\cdot)^{\mathsf{T}}$ is matrix transpose. We denote $\overline{(\cdot)}$ as the complex conjugate operator. Denote $\mathrm{Tr}(\mathbf{A})$ as the trace of $n \times n$ matrix $\mathbf{A}$ and $\mathrm{tr}(\mathbf{A})$ as the normalized trace $\mathrm{Tr}(\mathbf{A})/n$. The notation $\mathbb{E}[\cdot]$ denotes the expectation, and $\det(\cdot)$ denotes the matrix determinant.

Fig. 1. MIMO communications over the Rayleigh product channel with T transmit antennas, R receive antennas, and S scatterers.


II. Rayleigh Product MIMO Channels 


A. Channel Model 


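Consider a discrete-time, baseband MIMO system with $T$ transmit and $R$ receive antennas. The channel is assumed to follow the Rayleigh product fading with $S$ scattering objects, as shown in Fig. 1. The channels between the $s$-th scatterer and transmit antennas are denoted by vector $\boldsymbol{\theta}_s = [\theta_{s1}, \ldots, \theta_{sT}]$, and the channels between receive antennas and the $s$-th scatterer are denoted by vector $\boldsymbol{\psi}_s = [\psi_{s1}, \ldots, \psi_{sR}]^{\mathsf{T}}$. The end-to-end equivalent channel matrix $\mathbf{H}$ is given by

$$\mathbf{H} = \frac{1}{\sqrt{RS}}\,\boldsymbol{\Psi}\boldsymbol{\Theta}, \tag{1}$$

where $\boldsymbol{\Theta} = [\boldsymbol{\theta}_1^{\mathsf{T}}, \ldots, \boldsymbol{\theta}_S^{\mathsf{T}}]^{\mathsf{T}}$ and $\boldsymbol{\Psi} = [\boldsymbol{\psi}_1, \ldots, \boldsymbol{\psi}_S]$. We assume $\boldsymbol{\theta}_s \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_T)$ and $\boldsymbol{\psi}_s \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_R)$, where $\boldsymbol{\theta}_i$ and $\boldsymbol{\psi}_j$, $1 \le i, j \le S$, are statistically independent. The channel $\mathbf{H}$ is thus modeled as a product of two independent complex Gaussian random matrices. In line with [?], [8]-[11], the channel $\mathbf{H}$ is normalized by the constant $1/\sqrt{RS}$ so that the total energy of the channel is equal to an AWGN channel with an array gain $\mathbb{E}[\mathrm{Tr}(\mathbf{H}\mathbf{H}^\dagger)] = T$.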
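The normalization in (1) can be checked numerically. The following Python sketch samples the Rayleigh product channel and estimates $\mathbb{E}[\mathrm{Tr}(\mathbf{H}\mathbf{H}^\dagger)]$ by Monte Carlo. It assumes that (1) reads $\mathbf{H} = \boldsymbol{\Psi}\boldsymbol{\Theta}/\sqrt{RS}$ with i.i.d. $\mathcal{CN}(0,1)$ entries; the function names, the number of trials and the seed are illustrative choices that do not appear in the original development.

```python
import random


def complex_gaussian_matrix(rows, cols, rng):
    """Return a rows x cols list of lists with i.i.d. CN(0, 1) entries."""
    s = 0.5 ** 0.5  # real and imaginary parts each have variance 1/2
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(cols)]
            for _ in range(rows)]


def rayleigh_product_channel(R, T, S, rng):
    """Sample H = Psi @ Theta / sqrt(R * S), Psi being R x S and Theta S x T."""
    psi = complex_gaussian_matrix(R, S, rng)
    theta = complex_gaussian_matrix(S, T, rng)
    scale = (R * S) ** -0.5
    return [[scale * sum(psi[r][s] * theta[s][t] for s in range(S))
             for t in range(T)] for r in range(R)]


def channel_energy(H):
    """Tr(H H^dagger), i.e. the squared Frobenius norm of H."""
    return sum(abs(h) ** 2 for row in H for h in row)


def mean_channel_energy(R, T, S, trials=4000, seed=1):
    """Monte Carlo estimate of E[Tr(H H^dagger)] (seed chosen arbitrarily)."""
    rng = random.Random(seed)
    total = sum(channel_energy(rayleigh_product_channel(R, T, S, rng))
                for _ in range(trials))
    return total / trials
```

Under the stated assumption, `mean_channel_energy(R, T, S)` should be close to `T` for any number of scatterers, including the keyhole case `S = 1`.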

The presence of independent Gaussian matrices $\boldsymbol{\Theta}$ and $\boldsymbol{\Psi}$ in (1) requires two independent and richly-scattered environments, where the scattering happens between the $S$ scatterers/keyholes and transmit and receive arrays, respectively. This requires the existence of a large number of independently reflected and scattered paths around the antenna arrays [29]. The two environments are connected only via the $S$ scatterers/keyholes. By controlling the number $S$, the Rayleigh product channel (1) embraces a general family of MIMO fading channels, spanning from the degenerate keyhole channel $S = 1$ [?] to the full-rank Rayleigh MIMO channel $S \to \infty$ with fixed $R$ and $T$ [?]. Müller and Hofstetter [?] have shown that the number of significant scatterers is around ten in a typical office building with an 8 × 8 antenna configuration. Measurement results in [30] indicate that the effective rank of a 6 × 6 keyhole channel depends on the sizes of scatterer/keyhole at different transmission frequencies. In general, the number of separable scattering objects depends on the number of antenna elements since a larger array increases the spatial resolution [?]. Note that the model (1) also describes the MIMO relay channels when assuming noiseless relays [31].


B. Signal Model and Channel Capacity 

The channel output vector $\mathbf{y} \in \mathbb{C}^R$, at a given time instance, equals

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{2}$$

where $\mathbf{x} \in \mathbb{C}^T$ is the transmit vector that follows the complex Gaussian distribution $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma})$ with $\boldsymbol{\Sigma} = \mathbb{E}[\mathbf{x}\mathbf{x}^\dagger]$. The additive noise $\mathbf{n} \in \mathbb{C}^R$ is modeled as an i.i.d. complex Gaussian vector $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_R)$. In this paper, we have adopted the following assumptions:

A1) The Channel State Information (CSI) is perfectly known at the receiver but not at the transmitter.
A2) The channel $\mathbf{H}$ is frequency flat and quasi-static. It remains constant for certain symbol durations and takes independently a new value for each coherence time.
A3) Delay-constrained encoding/decoding. The encoded transmit message has a finite block length and is spread in time over no more than a maximum allowable decoding delay. We assume the length of a coding block is equal to one independently faded interval.

Under these assumptions, Telatar [32] has shown that the channel capacity is achieved when the transmitted symbols are independent across antennas and the power is equally allocated, i.e. $\boldsymbol{\Sigma} = \gamma\mathbf{I}_T$, where $\gamma$ denotes the Signal to Noise Ratio (SNR) per received antenna. The instantaneous capacity of the MIMO channel (2) in nats/sec/Hz is given by

$$\mathcal{I} = \log\det\left(\mathbf{I}_R + \gamma\mathbf{H}\mathbf{H}^\dagger\right) = \sum_{i=1}^{R}\log(1 + \gamma\lambda_i),$$

where $\lambda_i$, $i = 1, \ldots, R$, refer to the eigenvalues of $\mathbf{Q} = \mathbf{H}\mathbf{H}^\dagger$. For the Hermitian matrix $\mathbf{Q}$, we find it convenient to introduce the Empirical Spectral Distribution (ESD) defined as

$$F^R_{\mathbf{Q}}(\lambda) = \frac{1}{R}\sum_{i=1}^{R}\mathbb{1}(\lambda_i \le \lambda),$$

where $\mathbb{1}(\cdot)$ denotes the indicator function. By letting $\varphi(x) = \log(1 + \gamma x)$, the channel capacity $\mathcal{I}$ can be rewritten in terms of $F^R_{\mathbf{Q}}(\lambda)$ as

$$\mathcal{I} = R\int \varphi(\lambda)\,dF^R_{\mathbf{Q}}(\lambda). \tag{3}$$

As the channel matrix $\mathbf{H}$ is random, the instantaneous capacity (3) is also a random variable. Without CSI at the transmitter, there is a non-zero probability, independent of the code length, that the channel capacity (3) falls below any positive rate. Due to assumptions A2) and A3), the error probability corresponding to this rate cannot be decreased exponentially with the code length. In this case, no reliable transmission is possible and the performance cannot be evaluated using the ergodic capacity. Instead, the fundamental performance limit of such a system is best explained with the capacity versus outage tradeoff, characterized by the Cumulative Distribution Function (CDF) of $\mathcal{I}$. Given a fixed rate $r$, the outage probability is defined as the probability that the capacity $\mathcal{I}$ is less than $r$, i.e.

$$P_{\mathrm{out}}(r) = \Pr\{\mathcal{I} < r\} = F_{\mathcal{I}}(r), \tag{4}$$

where $F_{\mathcal{I}}(\cdot)$ denotes the CDF of $\mathcal{I}$. When the CDF $F_{\mathcal{I}}(\cdot)$ is monotonically increasing, the outage capacity for a given probability $P_{\mathrm{out}}$ is obtained as

$$\mathcal{I}_{\mathrm{out}} = F_{\mathcal{I}}^{-1}(P_{\mathrm{out}}).$$

The outage probability (4) is achievable in the sense that for any $\epsilon > 0$, there exists a code of sufficiently large block length for which the packet error rate is upper-bounded by $P_{\mathrm{out}}(r) + \epsilon$. Thus, outage capacity provides useful insights on the operational performance of a delay-constrained coded system. Outage probability is also a meaningful metric to characterize the performance of some contemporary communication systems [34], where timely CSI is available at the transmitters. From this viewpoint, the complementary outage probability $1 - P_{\mathrm{out}}(r)$ can be interpreted as the percentage of time that a transmission takes place at a given rate $r$ under perfect link adaptation.
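The definition (4) lends itself to a direct Monte Carlo estimate. The sketch below, written in plain Python, draws channel realizations, evaluates the instantaneous capacity as $\log\det(\mathbf{I}_R + \gamma\mathbf{H}\mathbf{H}^\dagger)$, and counts how often it falls below the rate $r$. It assumes the channel model $\mathbf{H} = \boldsymbol{\Psi}\boldsymbol{\Theta}/\sqrt{RS}$ and the capacity expression in nats; all function names, the trial count and the seed are hypothetical.

```python
import math
import random


def sample_product_channel(R, T, S, rng):
    """Sample H = Psi @ Theta / sqrt(R * S) with i.i.d. CN(0, 1) factors."""
    s = math.sqrt(0.5)
    psi = [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(S)]
           for _ in range(R)]
    theta = [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(T)]
             for _ in range(S)]
    scale = 1.0 / math.sqrt(R * S)
    return [[scale * sum(psi[r][k] * theta[k][t] for k in range(S))
             for t in range(T)] for r in range(R)]


def capacity_nats(H, snr):
    """log det(I_R + snr * H H^dagger), computed by Gaussian elimination."""
    R, T = len(H), len(H[0])
    A = [[(1.0 if i == j else 0.0)
          + snr * sum(H[i][t] * H[j][t].conjugate() for t in range(T))
          for j in range(R)] for i in range(R)]
    logdet = 0.0
    for k in range(R):
        pivot = A[k][k].real  # real and positive for a Hermitian PD matrix
        logdet += math.log(pivot)
        for i in range(k + 1, R):
            f = A[i][k] / pivot
            for j in range(k, R):
                A[i][j] -= f * A[k][j]
    return logdet


def outage_probability(rate, R, T, S, snr, trials=2000, seed=7):
    """Empirical Pr{I < rate} over independent channel realizations."""
    rng = random.Random(seed)
    hits = sum(capacity_nats(sample_product_channel(R, T, S, rng), snr) < rate
               for _ in range(trials))
    return hits / trials
```

Because the seed is fixed, the same channel realizations are reused for every rate, so the estimate is non-decreasing in `rate`, as a CDF should be.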


III. Statistics of Channel Capacity 

In this section, we first review the convergence of the empirical eigenvalue distribution of the Hermitian matrix $\mathbf{Q} = \mathbf{H}\mathbf{H}^\dagger$ when matrix dimensions grow to infinity. The capacity per receive antenna is shown to converge to a deterministic value and expressed by a known result in [?]. Then, we study the global fluctuation of eigenvalues around the limiting distribution by deriving a closed-form expression for the second order Cauchy transform of $\mathbf{Q}$. This result is utilized to obtain the asymptotic variance of the channel capacity.

A. First Order Cauchy Transform and Asymptotic Capacity

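For an $N \times N$ Hermitian random matrix $\mathbf{A}$, we assume that its ESD $F^N_{\mathbf{A}}(\cdot)$ converges to a non-random limiting distribution $F_{\mathbf{A}}(\cdot)$ as $N \to \infty$. Such a convergence is alternatively established in [35] via the convergence of the resolvent $G_{\mathbf{A}}(z)$ to the first order Cauchy transform² $\mathcal{G}_{\mathbf{A}}(z)$, defined as

$$G_{\mathbf{A}}(z) = \mathrm{tr}\left(z\mathbf{I}_N - \mathbf{A}\right)^{-1} = \int \frac{1}{z - t}\,dF^N_{\mathbf{A}}(t), \qquad \mathcal{G}_{\mathbf{A}}(z) = \int_{S_{\mathbf{A}}} \frac{1}{z - t}\,dF_{\mathbf{A}}(t). \tag{5}$$

Here, $z \in \mathbb{C}^+ = \{z : \mathrm{Im}(z) > 0\}$ and $S_{\mathbf{A}}$ denotes the support of $F_{\mathbf{A}}(\cdot)$. Due to this limiting behavior of eigenvalues, the normalized linear spectral statistics, such as the normalized capacity $\mathcal{I}/R$, converge to a non-random limit as $N \to \infty$ for a wide class of matrix ensembles [?], [14].

In the following, the limit $\lim_{R\to\infty}$ denotes the asymptotic regime,

$$R, T, S \to \infty, \quad \text{with } \rho = \frac{S}{R} \text{ and } \zeta = \frac{T}{S} \text{ fixed}. \tag{6}$$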

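In the asymptotic regime (6), Silverstein [?] shows that the ESD $F^R_{\mathbf{Q}}(\cdot)$ converges almost surely to a non-random CDF $F_{\mathbf{Q}}(\cdot)$ and its Cauchy transform $\mathcal{G}_{\mathbf{Q}}$ is the solution to

$$\mathcal{G}_{\mathbf{Q}}(z) = \left(z - \rho\int_{S_{\mathbf{P}}} \frac{\lambda\,dF_{\mathbf{P}}(\lambda)}{1 - \lambda\,\mathcal{G}_{\mathbf{Q}}(z)}\right)^{-1}, \tag{7}$$

where $\mathbf{P} = \boldsymbol{\Theta}\boldsymbol{\Theta}^\dagger/S$ and $F_{\mathbf{P}}(\cdot)$ is the well-known Marchenko-Pastur distribution [37]. The integration range in (7) is $[(1 - \sqrt{\zeta})^2, (1 + \sqrt{\zeta})^2]$. Using multiplicative free convolution, Müller has shown in [?] that $G_{\mathbf{Q}}(z) \to \mathcal{G}_{\mathbf{Q}}(z)$ in the asymptotic regime (6) and $\mathcal{G}_{\mathbf{Q}}$ satisfies the cubic equation

$$z^2\mathcal{G}^3_{\mathbf{Q}}(z) + (\rho\zeta + \rho - 2)\,z\,\mathcal{G}^2_{\mathbf{Q}}(z) + \big((\rho\zeta - 1)(\rho - 1) - \rho z\big)\,\mathcal{G}_{\mathbf{Q}}(z) + \rho = 0. \tag{8}$$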

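If $|z| \to \infty$, $\mathcal{G}_{\mathbf{Q}}(\cdot)$ admits the formal power series expansion

$$\mathcal{G}_{\mathbf{Q}}(z) = \sum_{n=0}^{\infty} \alpha_n z^{-n-1}, \tag{9}$$

where $\alpha_0 = 1$ and $\alpha_n$ is the $n$-th free moment of $\mathbf{Q}$, defined as

$$\alpha_n = \lim_{R\to\infty} \mathbb{E}\left[\mathrm{tr}(\mathbf{Q}^n)\right].$$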
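The cubic equation (8) and the expansion (9) can be cross-checked numerically. The Python sketch below solves (8) by Newton's method for a large real $z$, starting from the leading term $1/z$ of (9), and then reads off the first moment as $z(z\mathcal{G}_{\mathbf{Q}}(z) - 1) \approx \alpha_1$. It assumes that (8) reads $z^2\mathcal{G}^3 + (\rho\zeta + \rho - 2)z\mathcal{G}^2 + ((\rho\zeta - 1)(\rho - 1) - \rho z)\mathcal{G} + \rho = 0$ and uses $\alpha_1 = \lim \mathbb{E}[\mathrm{tr}(\mathbf{Q})] = T/R = \rho\zeta$, which follows from the channel normalization. The function names and the evaluation point are illustrative.

```python
def cauchy_transform_q(z, rho, zeta, iterations=50):
    """Solve the cubic (8) for G_Q(z) by Newton's method, started at 1/z."""
    a = rho * zeta + rho - 2.0
    b = (rho * zeta - 1.0) * (rho - 1.0)
    g = 1.0 / z  # leading term of the expansion (9)
    for _ in range(iterations):
        f = z * z * g ** 3 + a * z * g * g + (b - rho * z) * g + rho
        df = 3.0 * z * z * g * g + 2.0 * a * z * g + (b - rho * z)
        g -= f / df
    return g


def leading_moment_estimate(rho, zeta, z=1.0e4):
    """Approximate alpha_1 from G_Q(z) = 1/z + alpha_1/z^2 + O(1/z^3)."""
    g = cauchy_transform_q(z, rho, zeta)
    return z * (z * g - 1.0)
```

For example, `leading_moment_estimate(2.0, 0.5)` should be close to $\rho\zeta = 1$, with an error of order $1/z$.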

A concept closely related to the Cauchy transform $\mathcal{G}_{\mathbf{Q}}(z)$ is the R-transform $\mathcal{R}(z)$, defined through the functional relation [?]

$$\mathcal{G}_{\mathbf{Q}}\left(\mathcal{R}(z) + \frac{1}{z}\right) = z. \tag{10}$$

² In what follows, we refer to the first order Cauchy transform simply as Cauchy transform unless otherwise stated.

The R-transform $\mathcal{R}(z)$ has the formal power series representation

$$\mathcal{R}(z) = \sum_{n=1}^{\infty} \kappa_n z^{n-1}, \tag{11}$$

where $\kappa_n$ is the $n$-th free cumulant of $\mathbf{Q}$. As will be shown in Section IV, the free cumulant sequence $\{\kappa_n\}_{n\ge1}$ and its generating function $\mathcal{R}(z)$ serve as the key analytical tools in the proof of Proposition 2. Note that when the matrix dimensions are finite, the computation of $\alpha_n$ involves non-trivial summations over all partitions of integer $n$ [?, Eq. (27)]. This complicated expression makes it challenging to obtain an explicit expression for the free cumulant $\kappa_n$. As the matrix dimensions approach infinity, the calculation of $\kappa_n$ is much simplified, involving only the so-called non-crossing permutations over integers, see Lemma 1 and the Appendices for a detailed discussion.

Using random matrix theory techniques, the authors in [?] prove that the capacity per receive antenna of the Rayleigh product channel converges to an asymptotic limit such that $\lim_{R\to\infty} (\mathcal{I} - \mu_{\mathcal{I}})/R = 0$. The asymptotic capacity $\mu_{\mathcal{I}}/R$ is given by an explicit closed-form expression, which is summarized in the following proposition.


Proposition 1. (Asymptotic capacity [?]) When $R = T$, the asymptotic capacity per receive antenna $\mu_{\mathcal{I}}/R$ in the regime (6) is given by

where g is the unique solution to 

- (1- p)g‘^ + -(g-l) = 0 
7 

such that $(1 - g)/(g(g + \rho - 1)) > 0$.


Although the asymptotic capacity $\mu_{\mathcal{I}}$ grows to infinity in the asymptotic regime (6), it serves as a tight approximation to the mean capacity $\mathbb{E}[\mathcal{I}]$ with finite matrix dimensions as shown in [?]. In the following, we will also use $\mu_{\mathcal{I}}$ as the approximated $\mathbb{E}[\mathcal{I}]$ whenever it is clear from the context.


B. Second Order Cauchy Transform and Asymptotic Variance 

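As $F^R_{\mathbf{Q}}(\cdot) \to F_{\mathbf{Q}}(\cdot)$ in the asymptotic regime (6), the asymptotic capacity per receive antenna $\mu_{\mathcal{I}}/R$ can be formulated by replacing $F^R_{\mathbf{Q}}(\lambda)$ in (3) with $F_{\mathbf{Q}}(\lambda)$. Using an integral identity³ [27, Eq. (1.14)], we utilize an amenable form of asymptotic capacity, which is useful in the following discussion, namely

$$\frac{\mu_{\mathcal{I}}}{R} = \int_{S_{\mathbf{Q}}} \varphi(\lambda)\,dF_{\mathbf{Q}}(\lambda) = \frac{1}{2\pi i}\oint_{\mathcal{C}} \varphi(z)\,\mathcal{G}_{\mathbf{Q}}(z)\,dz. \tag{12}$$

³ The definition of Stieltjes transform in [27] is different from the Cauchy transform by a minus sign.

The complex integral on the right hand side of (12) is over any positively oriented closed contour $\mathcal{C}$ enclosing the support $S_{\mathbf{Q}}$ and on which $\varphi(\cdot)$ is analytic. For the instantaneous channel capacity (3), there exists a similar integral expression as in (12) with the Cauchy transform $\mathcal{G}_{\mathbf{Q}}(z)$ replaced with the resolvent $G_{\mathbf{Q}}(z)$. To see this, let the contour $\mathcal{C}$ be selected according to (12) and apply Cauchy's integral formula on $\varphi(\lambda)$; it follows that the instantaneous capacity (3) becomes

$$\frac{\mathcal{I}}{R} = \int \varphi(\lambda)\,dF^R_{\mathbf{Q}}(\lambda) = \frac{1}{2\pi i}\int\oint_{\mathcal{C}} \frac{\varphi(x)}{x - \lambda}\,dx\,dF^R_{\mathbf{Q}}(\lambda).$$

Exchanging the integrations and recalling the definition of the resolvent (5), we obtain

$$\frac{\mathcal{I}}{R} = \frac{1}{2\pi i}\oint_{\mathcal{C}} \varphi(x)\,G_{\mathbf{Q}}(x)\,dx. \tag{13}$$

Let us now consider the variance of capacity $\mathcal{I}$, defined as $\sigma^2_{\mathcal{I}} = \mathbb{E}[(\mathcal{I} - \mathbb{E}[\mathcal{I}])^2]$. Replacing $\mathcal{I}$ with (13) and $\mathbb{E}[\mathcal{I}]$ with (12), the variance $\sigma^2_{\mathcal{I}}$ can be rewritten as

$$\sigma^2_{\mathcal{I}} = \mathbb{E}\left[\frac{1}{2\pi i}\oint_{\mathcal{C}_x} \varphi(x)\,G_R(x)\,dx \cdot \frac{1}{2\pi i}\oint_{\mathcal{C}_y} \varphi(y)\,G_R(y)\,dy\right],$$

where $G_R(x) = R\left(G_{\mathbf{Q}}(x) - \mathcal{G}_{\mathbf{Q}}(x)\right)$, the contours $\mathcal{C}_x$ and $\mathcal{C}_y$ are non-overlapping and are taken in the same way as in (12). After interchanging the expectation and integrations, we have

$$\sigma^2_{\mathcal{I}} = -\frac{1}{4\pi^2}\oint_{\mathcal{C}_x}\oint_{\mathcal{C}_y} \varphi(x)\varphi(y)\,\mathrm{Cov}\left(G_R(x), G_R(y)\right)dx\,dy, \tag{14}$$

where $\mathrm{Cov}(G_R(x), G_R(y)) = \mathbb{E}[G_R(x) \cdot G_R(y)]$ is the covariance function of the matrix resolvent scaled by the matrix dimension $R$. In the context of free probability theory, this covariance function is known as the second order Cauchy transform [26] and is denoted as $\mathrm{Cov}(G_R(x), G_R(y)) = \mathcal{G}_{\mathbf{Q}}(x, y)$.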

The rest of this section is devoted to deriving the second order Cauchy transform of $\mathbf{Q}$ and the asymptotic variance of capacity $\sigma^2_{\mathcal{I}}$ by using a recent result from free probability theory. Namely, by the framework of the second order freeness [26], [?], the second order Cauchy transform $\mathcal{G}_{\mathbf{Q}}(x, y)$ exists if $\mathbf{Q}$ has a second order limiting distribution according to the following definition:

Definition 1. Let $\mathbf{A}_N$ be an $N \times N$ random matrix. We say that it has a second order limiting distribution if for all $m, n \ge 1$ the moments $\{\alpha_n\}_{n\ge1}$ and the limits

$$\alpha_{m,n} = \lim_{N\to\infty} k_2\left(\mathrm{Tr}(\mathbf{A}_N^m), \mathrm{Tr}(\mathbf{A}_N^n)\right)$$

exist and if for all $r \ge 3$ and all $n(1), \ldots, n(r) \ge 1$,

$$\lim_{N\to\infty} k_r\left(\mathrm{Tr}(\mathbf{A}_N^{n(1)}), \ldots, \mathrm{Tr}(\mathbf{A}_N^{n(r)})\right) = 0,$$

where $k_r$ denotes the $r$-th classical cumulant.


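As $\boldsymbol{\Theta}\boldsymbol{\Theta}^\dagger/S$ and $\boldsymbol{\Psi}^\dagger\boldsymbol{\Psi}/R$ are independent complex Wishart matrices, they are unitarily invariant and their second order limiting distributions exist [40, Th. 3.5]. It follows from [?, Eq. (29)] and the cyclic invariant property of the matrix trace that $\mathbf{Q}$ also has a second order limiting distribution. The second order Cauchy transform $\mathcal{G}_{\mathbf{Q}}(x, y)$ is given by the functional relation [?, Eq. (53)]

$$\mathcal{G}_{\mathbf{Q}}(x, y) = \mathcal{G}'_{\mathbf{Q}}(x)\,\mathcal{G}'_{\mathbf{Q}}(y)\,\mathcal{R}\left(\mathcal{G}_{\mathbf{Q}}(x), \mathcal{G}_{\mathbf{Q}}(y)\right) + \frac{\partial^2}{\partial x\,\partial y}\log\frac{\mathcal{G}_{\mathbf{Q}}(x) - \mathcal{G}_{\mathbf{Q}}(y)}{x - y}, \tag{15}$$

where $\mathcal{R}(x, y)$ denotes the second order R-transform of $\mathbf{Q}$. Similar to the first order case, if $|x| \to \infty$ and $|y| \to \infty$, $\mathcal{G}_{\mathbf{Q}}(x, y)$ and $\mathcal{R}(x, y)$ have formal power series representations

$$\mathcal{G}_{\mathbf{Q}}(x, y) = \sum_{m,n\ge1} \alpha_{m,n}\,x^{-m-1}y^{-n-1}, \quad \text{and} \quad \mathcal{R}(x, y) = \sum_{m,n\ge1} \kappa_{m,n}\,x^{m-1}y^{n-1}. \tag{16}$$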


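In the literature, the covariance function $\mathcal{G}_{\mathbf{B}}(x, y)$ for the Wishart type $N \times N$ random matrix $\mathbf{B} = (1/N)\mathbf{X}^\dagger\mathbf{T}\mathbf{X}$ has been studied in [27], where $\mathbf{T}$ is a non-random Hermitian matrix and $\mathbf{X}$ is a Gaussian-like⁴ random matrix with i.i.d. entries. Therein, the correlation function of $\mathbf{B}$ has the form

$$\mathcal{G}_{\mathbf{B}}(x, y) = \frac{\mathcal{G}'_{\mathbf{B}}(x)\,\mathcal{G}'_{\mathbf{B}}(y)}{\left(\mathcal{G}_{\mathbf{B}}(x) - \mathcal{G}_{\mathbf{B}}(y)\right)^2} - \frac{1}{(x - y)^2}, \tag{17}$$

and it is subsequently used to derive an asymptotic variance of Rayleigh MIMO capacity in [16]-[18]. Note that the second term of the right hand side of (15) is exactly the same as (17) by replacing $\mathbf{B}$ with $\mathbf{Q} = (1/R)\boldsymbol{\Psi}\mathbf{P}\boldsymbol{\Psi}^\dagger$ and assuming $\mathbf{P} = \boldsymbol{\Theta}\boldsymbol{\Theta}^\dagger/S$ non-random. Therefore, the fluctuation of capacity $\sigma^2_{\mathcal{I}}$ of Rayleigh product channels has a distinct functional structure from the Rayleigh MIMO channels, see (14) and (15). The increased fluctuation is due to a non-zero second order R-transform $\mathcal{R}(x, y)$. Closed-form expressions of $\mathcal{G}_{\mathbf{Q}}(x, y)$ and $\mathcal{R}(x, y)$ are summarized in the following proposition: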


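Proposition 2. The second order Cauchy transform of $\mathbf{Q}$ is given by (15) with

$$\mathcal{R}(x, y) = \frac{\mathcal{G}'_{\mathbf{P}}(1/x)\,\mathcal{G}'_{\mathbf{P}}(1/y)}{x^2y^2\left(\mathcal{G}_{\mathbf{P}}(1/x) - \mathcal{G}_{\mathbf{P}}(1/y)\right)^2} - \frac{1}{(x - y)^2}, \tag{18}$$

where $\mathcal{G}_{\mathbf{P}}(z)$ is the Cauchy transform of a Marchenko-Pastur distribution with the parameter $\zeta$ as in (7),

$$\mathcal{G}_{\mathbf{P}}(z) = \frac{1}{2} + \frac{1 - \zeta}{2z} - \sqrt{\frac{1}{4} - \frac{1 + \zeta}{2z} + \frac{(1 - \zeta)^2}{4z^2}}. \tag{19}$$

⁴ Each entry of the Gaussian-like matrix has the same second and fourth moments as a Gaussian random variable.

Proof. The proof of Proposition 2 depends on the combinatorial structure of the cumulants $\{\kappa_n\}_{n\ge1}$ and $\{\kappa_{m,n}\}_{m,n\ge1}$ and is given in detail in Section IV. □

Substituting (15) into (14) and denoting $G(x) = \mathcal{G}_{\mathbf{P}}(1/\mathcal{G}_{\mathbf{Q}}(x))$, the asymptotic variance $\sigma^2_{\mathcal{I}}$ is rewritten as

$$\sigma^2_{\mathcal{I}} = -\frac{1}{4\pi^2}\oint_{\mathcal{C}_x}\oint_{\mathcal{C}_y} \frac{\varphi(x)\varphi(y)}{\left(G(x) - G(y)\right)^2}\,dG(x)\,dG(y). \tag{20}$$

In the general setting, it is difficult to further simplify the double integral (20). However, when the transmitter and receiver have an equal number of antennas, i.e. $\zeta = 1/\rho$, a compact expression for $\sigma^2_{\mathcal{I}}$ can be obtained. The results are summarized in the following proposition.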


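Proposition 3. When $R = T$, the asymptotic variance is given by

$$\sigma^2_{\mathcal{I}} = \log\frac{\gamma(\omega_r - 1)^2}{\gamma - \omega_r^2(2\omega_r - 2)}, \tag{21}$$

where $\omega_r < 0$ is the solution of the cubic equation

$$\omega^3 - 2\omega^2 + (1 - \gamma + \gamma\zeta)\,\omega + \gamma = 0. \tag{22}$$

Proof. The proof of Proposition 3 is given in the Appendices. □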
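Proposition 3 is straightforward to evaluate numerically. The Python sketch below locates the negative root of the cubic by bisection and inserts it into the variance expression. It assumes that (22) reads $\omega^3 - 2\omega^2 + (1 - \gamma + \gamma\zeta)\omega + \gamma = 0$ and that (21) reads $\sigma^2_{\mathcal{I}} = \log[\gamma(\omega_r - 1)^2/(\gamma - \omega_r^2(2\omega_r - 2))]$; the function names are hypothetical. Under these assumptions the cubic is positive at $\omega = 0$ and tends to $-\infty$ as $\omega \to -\infty$, so a sign change on the negative axis always exists.

```python
import math


def negative_root(snr, zeta):
    """Negative real root of w^3 - 2 w^2 + (1 - snr + snr * zeta) w + snr = 0."""
    def f(w):
        return w ** 3 - 2.0 * w ** 2 + (1.0 - snr + snr * zeta) * w + snr

    lo, hi = -1.0, 0.0  # f(0) = snr > 0
    while f(lo) > 0.0:  # expand the bracket until the sign changes
        lo *= 2.0
    for _ in range(200):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def asymptotic_variance(snr, zeta):
    """Asymptotic capacity variance for R = T, following the assumed form of (21)."""
    w = negative_root(snr, zeta)
    return math.log(snr * (w - 1.0) ** 2 / (snr - w * w * (2.0 * w - 2.0)))
```

Two limiting cases give a quick plausibility check. At low SNR the capacity behaves like $\gamma\,\mathrm{Tr}(\mathbf{H}\mathbf{H}^\dagger)$, whose variance for $R = T$ is $(1 + 2\zeta)\gamma^2$, and `asymptotic_variance(1e-4, 1.0)` is close to $3 \times 10^{-8}$. At high SNR with $\zeta = 2$ the value approaches $2\log 2$.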


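Using Cardano's formula to solve the cubic equation (22), the explicit expressions for the roots are

$$\omega_1 = \frac{2}{3} - \frac{3\gamma\zeta - 3\gamma - 1}{3u(\gamma, \zeta)} + \frac{u(\gamma, \zeta)}{3}, \tag{23}$$

$$\omega_2 = \frac{2}{3} + \frac{1 + i\sqrt{3}}{2}\,\frac{3\gamma\zeta - 3\gamma - 1}{3u(\gamma, \zeta)} - \frac{1 - i\sqrt{3}}{2}\,\frac{u(\gamma, \zeta)}{3}, \tag{24}$$

$$\omega_3 = \frac{2}{3} + \frac{1 - i\sqrt{3}}{2}\,\frac{3\gamma\zeta - 3\gamma - 1}{3u(\gamma, \zeta)} - \frac{1 + i\sqrt{3}}{2}\,\frac{u(\gamma, \zeta)}{3}, \tag{25}$$

where

$$u(\gamma, \zeta) = \left(\sqrt{(3\gamma\zeta - 3\gamma - 1)^3 + \left(1 + \frac{9\gamma}{2} + 9\gamma\zeta\right)^2} - 1 - \frac{9\gamma}{2} - 9\gamma\zeta\right)^{1/3}. \tag{26}$$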
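The closed-form roots can be verified against the cubic directly. The following Python sketch evaluates the Cardano expressions with complex arithmetic and returns all three roots. It assumes the cubic $\omega^3 - 2\omega^2 + (1 - \gamma + \gamma\zeta)\omega + \gamma = 0$ and the form of $u(\gamma, \zeta)$ written out in the docstring; the function name is hypothetical, and the principal branches of the square and cube roots are an implementation choice, so the ordering of the returned roots need not match the numbering (23)-(25).

```python
import cmath


def cardano_roots(snr, zeta):
    """Roots of w^3 - 2 w^2 + (1 - snr + snr * zeta) w + snr = 0.

    Uses u = (sqrt(p^3 + a^2) - a)^(1/3) with p = 3*snr*zeta - 3*snr - 1 and
    a = 1 + 9*snr/2 + 9*snr*zeta, and w_k = 2/3 - p / (3 u_k) + u_k / 3,
    where u_k runs over the three complex cube roots.
    The degenerate case p = 0 (where u = 0) is not handled in this sketch.
    """
    p = 3.0 * snr * zeta - 3.0 * snr - 1.0
    a = 1.0 + 4.5 * snr + 9.0 * snr * zeta
    u = (cmath.sqrt(p ** 3 + a ** 2) - a) ** (1.0 / 3.0)  # principal branch
    xi = complex(-0.5, 3.0 ** 0.5 / 2.0)  # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * xi ** k
        roots.append(2.0 / 3.0 - p / (3.0 * uk) + uk / 3.0)
    return roots
```

The root relevant for the variance is the one that is real and negative; it can be selected by checking the imaginary part against a small tolerance.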


For general values of $\gamma$ and $\zeta$, it is not straightforward to gain insights based on the variance expressions (21), (23)-(26). However, in the high SNR regime with $\gamma \gg 1$, the asymptotic variance $\sigma^2_{\mathcal{I}}$ is characterized by explicit expressions and the behavior of capacity fluctuation can be understood.


Corollary 1. In the high SNR regime $\gamma \gg 1$, the asymptotic variance $\sigma^2_{\mathcal{I}}$ is given by (21) with $\omega_r$ approximated by

$$\omega_r \approx \begin{cases} \dfrac{1}{1 - \zeta}, & \zeta > 1, \\[2mm] -\sqrt{(1 - \zeta)\gamma}, & 0 < \zeta < 1. \end{cases}$$

For a fixed SNR, the variance of channel capacity is highest when the number of scattering objects equals the number of antennas.

Proof. The proof of Corollary 1 is given in the Appendices. □
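The quality of the high SNR approximations can be gauged by comparing them with the exact negative root. The Python sketch below does so under the assumption that the cubic (22) reads $\omega^3 - 2\omega^2 + (1 - \gamma + \gamma\zeta)\omega + \gamma = 0$ and that the two cases of the corollary are $1/(1 - \zeta)$ for $\zeta > 1$ and $-\sqrt{(1 - \zeta)\gamma}$ for $0 < \zeta < 1$. Function names are illustrative.

```python
def exact_negative_root(snr, zeta):
    """Negative root of the assumed cubic (22), found by bisection."""
    def f(w):
        return w ** 3 - 2.0 * w ** 2 + (1.0 - snr + snr * zeta) * w + snr

    lo, hi = -1.0, 0.0  # f(0) = snr > 0
    while f(lo) > 0.0:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def high_snr_root(snr, zeta):
    """High SNR approximation of the negative root (two cases only)."""
    if zeta > 1.0:
        return 1.0 / (1.0 - zeta)
    if 0.0 < zeta < 1.0:
        return -((1.0 - zeta) * snr) ** 0.5
    raise ValueError("zeta = 1 is not covered by the two cases used here")


def relative_error(snr, zeta):
    """Relative deviation of the approximation from the exact root."""
    exact = exact_negative_root(snr, zeta)
    return abs(high_snr_root(snr, zeta) - exact) / abs(exact)
```

At $\gamma = 10^6$ the relative error is well below one percent for both $\zeta = 2$ and $\zeta = 0.25$, which is in line with the approximation being intended for $\gamma \gg 1$.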

The asymptotic variance $\sigma^2_{\mathcal{I}}$ is derived with the assumption that the dimensions of matrices are large. However, $\sigma^2_{\mathcal{I}}$ serves as a good approximation for the variance of capacity even when the matrix dimensions are comparable to realistic MIMO systems. In Fig. 2, we plot the variance of channel capacity as a function of the number of scattering objects $S$ for 4 × 4 and 8 × 8 MIMO systems. The asymptotic variance is calculated by Proposition 3 at SNRs $\gamma$ ranging from −20 dB to 20 dB with a step size of 5 dB. The analytical calculations are compared with Monte Carlo simulations, where each curve is generated by $10^6$ independent channel realizations. We also plot the asymptotic variance of a conventional Rayleigh MIMO channel using [?, Eq. (13)]. Fig. 2 shows that the asymptotic variance achieves a good agreement with the simulations for a wide range of SNRs and numbers of scatterers, especially in the low SNR regime. It is only when $\gamma > 10$ dB that there are observable gaps between analytical and simulation curves. The asymptotic variance for an 8 × 8 MIMO system remains a better approximation than that of a 4 × 4 system, as expected. In the high SNR regime, see Fig. 2(a), there exists a peak value for the variance of capacity when $S > 1$. As the SNR $\gamma$ increases, the peak of the variance occurs at a fixed value $S = R = T$ ($\zeta = \rho = 1$), which is in line with our prediction in Corollary 1. This is analogous to the observations in [15] that the capacity variance of the conventional Rayleigh MIMO channel is largest when $R = T$.

On the other hand, the variance is monotonically decreasing in the low SNR regime, see Fig. 2(b). As the number of scatterers becomes large, we also observe that the capacity variance of the Rayleigh product channel approaches a limit. This limit is set by the variance of the conventional Rayleigh MIMO channel with the same number of antennas. This agrees with the results in [?], where the multi-keyhole channel converges to a Rayleigh MIMO channel when the number of scatterers is large.


IV. Second Order Cumulants and Cauchy Transform 

This section is devoted to the proof of Proposition 2, which relies on the knowledge of free cumulants of $\mathbf{Q}$. Let us recall from (11) and (16) that $\mathcal{R}(z)$ and $\mathcal{R}(x, y)$ are generating functions for the cumulant sequences $\{\kappa_n\}_{n\ge1}$ and $\{\kappa_{m,n}\}_{m,n\ge1}$, respectively. We will first deduce the combinatorial descriptions of $\kappa_n$ and $\kappa_{m,n}$. These results reveal that the cumulant sequences of $\mathbf{Q}$ have the same combinatorial structures as the moment sequences of a Marchenko-Pastur distribution. Hence, $\mathcal{R}(x, y)$ can be obtained based on known results. The notations and terminologies used in the formulation of Lemma 1 and in the proof of Proposition 2 are given in Appendix C.

Fig. 2. Variance of the Rayleigh product channel capacity. Solid line: asymptotic variance of 4 × 4 Rayleigh product channel (21); dashed line: asymptotic variance of 8 × 8 Rayleigh product channel (21); dotted line: asymptotic variance of Rayleigh MIMO channel [?, Eq. (13)]; markers: simulation. (a) 0 dB ≤ γ ≤ 20 dB. (b) −20 dB ≤ γ ≤ −5 dB.

Lemma 1. For integers $m, n \ge 1$, the first order free cumulant $\kappa_n$ of matrix $\mathbf{Q}$ is given by

$$\kappa_n = \rho\sum_{\tau \in S_{\mathrm{d\text{-}nc}}(n)} \zeta^{\#(\tau)}, \tag{27}$$

and the second order free cumulant $\kappa_{m,n}$ is given by

$$\kappa_{m,n} = \sum_{\tau \in S_{\mathrm{a\text{-}nc}}(m,n)} \zeta^{\#(\tau)}, \tag{28}$$

where $S_{\mathrm{d\text{-}nc}}(n)$ and $S_{\mathrm{a\text{-}nc}}(m, n)$ denote the sets of non-crossing permutations in the disc and annular sense, and $\#(\tau)$ denotes the number of orbits of $\tau$.

Proof. The proof of Lemma 1 is given in the Appendices. □

By comparing (27) with the $n$-th moment $\beta_n$ of $\mathbf{P}$ [?, Eq. (7.3)], we have $\kappa_n = \rho\beta_n$ with $n > 0$. The cumulant $\kappa_n$ can be viewed as the scaled version of the moment $\beta_n$, where the normalized trace has the normalization factor $R$ instead of the actual matrix dimension $S$. Similarly, by comparing (28) with the second order moment $\beta_{m,n}$ of $\mathbf{P}$ [?, Eq. (7.5)], we have the second order cumulant $\kappa_{m,n} = \beta_{m,n}$, where no normalization is needed as in Definition 1. Note that the normalization for the first order moments can be arbitrarily chosen without affecting the underlying combinatorial structures provided that the normalization factor grows at the same rate as the matrix dimensions. The functional relations between moments and cumulants as well as their generating functions $\mathcal{G}_{\mathbf{Q}}(\cdot)$, $\mathcal{R}(\cdot)$ are therefore preserved. For notational simplicity, it is convenient to work with the properly scaled moment sequences $\{\beta_n\}$ and $\{\beta_{m,n}\}$.
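The relation $\kappa_n = \rho\beta_n$ gives a simple recipe for the free moments of $\mathbf{Q}$. The Python sketch below computes the Marchenko-Pastur moments $\beta_n$ by counting non-crossing partitions with Narayana numbers, scales them by $\rho$ to obtain the free cumulants, and converts the cumulants into moments with the standard free moment-cumulant recursion $\alpha_n = \sum_{s=1}^{n} \kappa_s \sum_{i_1 + \cdots + i_s = n - s} \alpha_{i_1}\cdots\alpha_{i_s}$. It assumes that each non-crossing permutation in (27) is weighted by $\zeta$ raised to its number of orbits; the function names are hypothetical.

```python
from math import comb


def mp_moment(n, zeta):
    """n-th moment of the Marchenko-Pastur law whose free cumulants all equal zeta."""
    # comb(n, k) * comb(n, k - 1) // n is the Narayana number: the number of
    # non-crossing partitions of {1, ..., n} with exactly k blocks.
    return sum(comb(n, k) * comb(n, k - 1) // n * zeta ** k
               for k in range(1, n + 1))


def moments_from_free_cumulants(kappa):
    """Free moment-cumulant recursion; kappa[s - 1] is the s-th free cumulant."""
    alpha = [1.0]  # alpha_0 = 1
    for n in range(1, len(kappa) + 1):
        total = 0.0
        for s in range(1, n + 1):
            # poly[d] becomes the coefficient of x^d in (sum_i alpha_i x^i)^s,
            # truncated at degree n - s.
            poly = [1.0]
            for _ in range(s):
                poly = [sum(poly[i] * alpha[d - i]
                            for i in range(len(poly))
                            if 0 <= d - i < len(alpha))
                        for d in range(n - s + 1)]
            total += kappa[s - 1] * poly[n - s]
        alpha.append(total)
    return alpha


def product_channel_moments(rho, zeta, n_max):
    """Free moments alpha_0, ..., alpha_{n_max} of Q, using kappa_n = rho * beta_n."""
    kappa = [rho * mp_moment(n, zeta) for n in range(1, n_max + 1)]
    return moments_from_free_cumulants(kappa)
```

For $\rho = 2$ and $\zeta = 0.5$ this yields $\alpha_1 = \rho\zeta = 1$ and $\alpha_2 = \kappa_2 + \kappa_1^2 = 2.5$.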

As $\beta_n$ is the $n$-th moment of a Marchenko-Pastur distribution with parameter $\zeta$, by the moment-cumulant relation [42], the corresponding cumulant $c_n$ equals $\zeta$. In addition, it follows from the second order moment-cumulant relation [26] that the moment $\beta_{m,n}$ can be expressed in terms of $c_n$ as well as the corresponding second order cumulant $c_{m,n}$

$$\beta_{m,n} = \sum_{\pi \in S_{\mathrm{a\text{-}nc}}(m,n)} \prod_{i=1}^{r} c_{n_i} + \sum_{\substack{\pi_1 \in S_{\mathrm{d\text{-}nc}}(m) \\ \pi_2 \in S_{\mathrm{d\text{-}nc}}(n) \\ |\mathcal{V}| = |\pi_1 \times \pi_2| + 1}} c_{m_k, n_l} \prod_{\substack{i=1 \\ i \ne k}}^{r} c_{m_i} \prod_{\substack{j=1 \\ j \ne l}}^{t} c_{n_j}. \tag{29}$$

On the right hand side of (29), $\pi \in S_{\mathrm{a\text{-}nc}}(m, n)$ contains $r \ge 1$ orbits and the $i$-th orbit contains $n_i$ elements with $n_1 + \cdots + n_r = m + n$. In the second summation, $\pi_1 \in S_{\mathrm{d\text{-}nc}}(m)$ and $\pi_2 \in S_{\mathrm{d\text{-}nc}}(n)$ contain $r \ge 1$ and $t \ge 1$ orbits, respectively. The $i$-th orbit of $\pi_1$ contains $m_i$ elements with $m_1 + \cdots + m_r = m$, and the $j$-th orbit of $\pi_2$ contains $n_j$ elements with $n_1 + \cdots + n_t = n$. The partition $\mathcal{V}$ is composed of elements from the $k$-th orbit of $\pi_1$ and the $l$-th orbit of $\pi_2$ and corresponds to the second order cumulant $c_{m_k, n_l}$. Inserting $c_n$ into (29) and comparing with (28), we notice that $\beta_{m,n}$ is entirely determined by the summation over non-crossing annular permutations and (29) is only valid when the second order cumulant $c_{m,n}$ is zero. In summary, the cumulant sequences are

$$c_n = \zeta \quad \text{and} \quad c_{m,n} = 0, \quad m, n \ge 1. \tag{30}$$

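Applying the functional relation [26, Eq. (52)], we have

$$M(x, y) = C\big(xM(x), yM(y)\big)\,\frac{\frac{d}{dx}(xM(x))}{M(x)}\,\frac{\frac{d}{dy}(yM(y))}{M(y)} + xy\left(\frac{\frac{d}{dx}(xM(x))\,\frac{d}{dy}(yM(y))}{\big(xM(x) - yM(y)\big)^2} - \frac{1}{(x - y)^2}\right), \tag{31}$$

where

$$M(x) = 1 + \sum_{n\ge1} \beta_n x^n, \quad M(x, y) = \sum_{m,n\ge1} \beta_{m,n}\,x^m y^n, \quad C(x, y) = \sum_{m,n\ge1} c_{m,n}\,x^m y^n. \tag{32}$$

Due to (30), the formal power series $C(x, y) = 0$ and $xM(x) = \mathcal{G}_{\mathbf{P}}(1/x)$, and (31) becomes

$$M(x, y) = \frac{\mathcal{G}'_{\mathbf{P}}(1/x)\,\mathcal{G}'_{\mathbf{P}}(1/y)}{xy\left(\mathcal{G}_{\mathbf{P}}(1/x) - \mathcal{G}_{\mathbf{P}}(1/y)\right)^2} - \frac{xy}{(x - y)^2}. \tag{33}$$

Comparing (16) with (32), we obtain $\mathcal{R}(x, y) = M(x, y)/(xy)$. This completes the proof of Proposition 2.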


V. Asymptotic Capacity Distribution 

In this section, we prove a central limit theorem for the linear spectral statistics of the matrix $\mathbf{Q} = \mathbf{H}\mathbf{H}^\dagger$ and show that the CDF of channel capacity $\mathcal{I}$ is asymptotically Gaussian as the matrix dimensions grow to infinity. This result generalizes the well-known CLT for the correlated Wishart matrix [27]. Together with the asymptotic mean and variance of capacity calculated in Propositions 1 and 3, the Gaussian convergence of capacity $\mathcal{I}$ gives a compact yet accurate approximation for the outage capacity. In addition, the approximate CDF of $\mathcal{I}$ is useful to analyze the diversity-multiplexing tradeoff of Rayleigh product channels in the finite SNR regime.
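Once the capacity is approximated as Gaussian, both the outage probability and the outage capacity follow from the normal CDF and its inverse. The Python sketch below illustrates this with the standard library class `statistics.NormalDist`. The mean and variance passed to the functions are placeholders for the asymptotic mean $\mu_{\mathcal{I}}$ and variance $\sigma^2_{\mathcal{I}}$; the function names and any numerical values used with them are hypothetical.

```python
from statistics import NormalDist


def gaussian_outage_probability(rate, mean, variance):
    """Pr{I < rate} when I is approximated as Gaussian(mean, variance)."""
    return NormalDist(mu=mean, sigma=variance ** 0.5).cdf(rate)


def gaussian_outage_capacity(p_out, mean, variance):
    """Rate whose outage probability equals p_out (requires 0 < p_out < 1)."""
    return NormalDist(mu=mean, sigma=variance ** 0.5).inv_cdf(p_out)
```

For a target outage probability below one half, the resulting outage capacity lies below the mean by a multiple of the standard deviation, so a larger capacity variance translates directly into a lower outage capacity.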


A. Central Limit Theorem of Linear Spectral Statistics and Outage Probability 

Let $H_R(x) = R\left(F^R_{\mathbf{Q}}(x) - F_{\mathbf{Q}}(x)\right)$; we are interested in the distribution of the random variable

$$\mathcal{I} - \mu_{\mathcal{I}} = \int_{S_{\mathbf{Q}}} \varphi(x)\,dH_R(x). \tag{34}$$

Using the integral identities (12) and (13), we can rewrite (34) as

$$\mathcal{I} - \mu_{\mathcal{I}} = \frac{1}{2\pi i}\oint_{\mathcal{C}} \varphi(z)\,G_R(z)\,dz, \tag{35}$$

where $\mathcal{C}$ is the closed contour selected as in (12). In the following proposition, we prove that $G_R(z_1)$ and $G_R(z_2)$ with $z_1, z_2 \in \mathcal{C}$ are jointly Gaussian distributed in the asymptotic regime (6).

Proposition 4. In the asymptotic regime (6), $\{G_R(z)\}_{z \in \mathcal{C}}$ forms a tight sequence (see, e.g., [?]) on a closed contour $\mathcal{C}$ enclosing the support of $F_{\mathbf{Q}}(\cdot)$, and $G_R(z)$ converges weakly to a Gaussian process on the complex plane.

Proof. The proof of Proposition 4 is given in the Appendices. □

By Proposition 4, the asymptotic Gaussianity of (35) follows from the fact that the Riemann sum corresponding to this integral has jointly Gaussian summands, and a sum of jointly Gaussian random variables can only be Gaussian. Proposition 4 generalizes the CLT of LSS for Wishart-type random matrices involving one deterministic correlation matrix and one random matrix with i.i.d. entries [27]. When both matrices $\Psi$ and $\Theta$ are random and independent, $G_R(z)$ can be decomposed into two random processes, see (66) in Appendix E. Both random processes are asymptotically Gaussian and each is governed by Lemma 3. As already discussed in Section III-B, the induced fluctuation of LSS is characterized by both the first-order Cauchy transform and the second-order $R$-transform. This is different from the Wishart random matrices, where the corresponding $G_R(z)$ only involves one asymptotic Gaussian process and the fluctuation of LSS is solely determined by the first-order Cauchy transform. This makes the CLT of Rayleigh product ensembles distinct from the one in [27]. Together with the mean $\mu_{\mathcal{I}}$ and variance $\sigma_{\mathcal{I}}^2$ in Propositions 1 and 3, an analytical Gaussian approximation to the capacity distribution of the Rayleigh product channel is available. This result cannot be directly derived from the existing results in [14]-[18]. Note that the CLT of LSS for the biorthogonal ensembles, such as the Rayleigh product ensemble, was proved by Breuer and Duits [28] for polynomial functions $\varphi(x)$. However, it is not clear how to extend this result to generic analytic functions $\varphi(x)$ such as the channel capacity $\varphi(x) = \log(1 + \gamma x)$. Recently, the CLT for the product of two real and square random matrices was proved by Götze, Naumov, and Tikhomirov [43] for smooth functions $\varphi(x)$.

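Let $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt$ denote the error function. The Gaussian approximation to the CDF of the channel capacity $\mathcal{I}$ is

$$F_{\mathcal{I}}(x) \approx \frac{1}{2} + \frac{1}{2}\, \mathrm{erf}\!\left( \frac{x - \mu_{\mathcal{I}}}{\sqrt{2}\, \sigma_{\mathcal{I}}} \right), \tag{36}$$

and thus the outage capacity is

$$\mathcal{I}_{\mathrm{out}} \approx \mu_{\mathcal{I}} + \sqrt{2}\, \sigma_{\mathcal{I}}\, \mathrm{erf}^{-1}(2P_{\mathrm{out}} - 1), \qquad P_{\mathrm{out}} \in (0,1). \tag{37}$$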
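As a small numerical companion to (36) and (37), the following Python sketch evaluates the Gaussian approximation of the capacity CDF and the corresponding outage capacity. It assumes that the capacity mean and standard deviation (the illustrative arguments `mu` and `sigma`) have already been computed elsewhere; the numbers in the usage lines are placeholders and are not results of this paper. The inverse error function is obtained through the standard normal quantile, using $\sqrt{2}\,\mathrm{erf}^{-1}(2p-1) = \Phi^{-1}(p)$.

```python
import math
from statistics import NormalDist


def capacity_cdf(x, mu, sigma):
    """Gaussian approximation (36): P(I <= x) for capacity mean mu and std sigma."""
    return 0.5 + 0.5 * math.erf((x - mu) / (math.sqrt(2.0) * sigma))


def outage_capacity(p_out, mu, sigma):
    """Gaussian quantile (37): the rate whose outage probability equals p_out."""
    if not 0.0 < p_out < 1.0:
        raise ValueError("p_out must lie in (0, 1)")
    # sqrt(2) * erfinv(2p - 1) is the standard normal quantile of p.
    return mu + sigma * NormalDist().inv_cdf(p_out)


if __name__ == "__main__":
    mu, sigma = 5.0, 0.6  # placeholder values in nats, not taken from the paper
    rate = outage_capacity(0.01, mu, sigma)
    print(rate, capacity_cdf(rate, mu, sigma))
```

Evaluating `capacity_cdf` at the returned rate recovers the requested outage level, which is a convenient self-check when plugging in one's own mean and variance.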


Fig. 3. CDF of channel capacity in the presence of a 4 × 4 Rayleigh product channel. Solid line: Gaussian approximation (36); markers: simulations; dashed line: CDF of 4 × 4 MIMO capacity with independent Rayleigh fading.

Based on (36) and (37), the outage behavior of the Rayleigh product channel can be understood. In Fig. 3, the impact of the number of scatterers $S$ as well as the received SNR $\gamma$ is studied, where a 4 × 4 MIMO system is considered in the presence of $S = 2, 4, 8$, and $32$ scattering objects. We plot the approximate outage probability as well as the empirical one obtained by Monte Carlo simulations. The outage probabilities are evaluated at SNRs $\gamma = 3$ dB and $\gamma = 10$ dB. As a comparison, we also plot the outage probability of a 4 × 4 Rayleigh MIMO channel with independent fading entries. As the number of scatterers $S$ increases, the outage capacity at a given probability level increases rapidly until $S$ equals the number of antennas, which is especially visible when the SNR is large. In this range, the rank of the channel matrix is limited by the number of scatterers, and increasing the number of scatterers effectively improves the rank of the channel matrix. When $S > 4$, the matrix rank is limited by the number of antennas and the improvement of the outage capacity is relatively slow. Yet, the outage probability curve approaches a
limit, which corresponds to the outage probability of a conventional Rayleigh MIMO channel as predicted 

by 0. 

Fig. 4. 1% outage capacity of Rayleigh product channels with $S = 8$ scattering objects and equal numbers of antennas. Solid line: Gaussian quantile approximation (37); markers: simulations; dashed line: outage capacity of conventional Rayleigh MIMO channels with equal numbers of antennas.

In Fig. 4, we examine the impact of the number of antennas on the outage capacity. We plot the approximate 1% outage capacity (37) as a function of the received SNR $\gamma$. We assume $T = R = 2, 4, 8$, and $16$ transmit and receive antennas while fixing the number of scatterers at $S = 8$. As expected, the outage capacity of the Rayleigh product channel is lower than that of the conventional Rayleigh MIMO channel due to the presence of a finite number of scatterers. In the high-SNR regime, the outage capacity curves of both channels attain the same slope when $T < S$, which suggests that the capacity scales at the same rate as the limiting Rayleigh MIMO channel. On the other hand, when $S < T$ there is an increasing gap between the two channels as $\gamma$ increases. Finally, it is observed from Figs. 3 and 4 that the Gaussian approximations (36) and (37) are reasonably accurate for a wide range of parameter settings.
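The simulated markers in Figs. 3 and 4 come from Monte Carlo draws of the channel capacity. A minimal pure-Python sketch of such a simulation is given below. It is only an illustration under stated assumptions: the channel is taken as $H = \Theta\Psi/\sqrt{S}$ with i.i.d. $\mathcal{CN}(0,1)$ entries in the $R \times S$ matrix $\Theta$ and the $S \times T$ matrix $\Psi$, the capacity is $\log\det(I + (\gamma/T) HH^{\dagger})$ in nats, and all function names are hypothetical. The normalization used in this paper may differ.

```python
import math
import random


def complex_gaussian_matrix(rows, cols, rng):
    """i.i.d. CN(0, 1) entries (real and imaginary parts of variance 1/2 each)."""
    s = math.sqrt(0.5)
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(cols)]
            for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def log_det_hermitian_pd(m):
    """log det of a Hermitian positive definite matrix via Gaussian elimination."""
    m = [row[:] for row in m]
    n, total = len(m), 0.0
    for k in range(n):
        pivot = m[k][k].real  # pivots of a positive definite matrix are real and > 0
        total += math.log(pivot)
        for i in range(k + 1, n):
            f = m[i][k] / pivot
            for j in range(k, n):
                m[i][j] -= f * m[k][j]
    return total


def capacity_sample(R, S, T, snr, rng):
    """One capacity draw (nats) for H = Theta Psi / sqrt(S) (assumed normalization)."""
    theta = complex_gaussian_matrix(R, S, rng)
    psi = complex_gaussian_matrix(S, T, rng)
    h = [[x / math.sqrt(S) for x in row] for row in matmul(theta, psi)]
    h_herm = [[h[j][i].conjugate() for j in range(R)] for i in range(T)]
    q = matmul(h, h_herm)
    m = [[(1.0 if i == j else 0.0) + (snr / T) * q[i][j] for j in range(R)]
         for i in range(R)]
    return log_det_hermitian_pd(m)


def empirical_outage_capacity(samples, p_out):
    """Empirical p_out-quantile of the simulated capacities."""
    ordered = sorted(samples)
    return ordered[max(0, math.ceil(p_out * len(ordered)) - 1)]


if __name__ == "__main__":
    rng = random.Random(0)
    snr = 10 ** (10 / 10)  # 10 dB
    draws = [capacity_sample(4, 2, 4, snr, rng) for _ in range(2000)]
    print(empirical_outage_capacity(draws, 0.01))
```

The empirical quantile returned by `empirical_outage_capacity` plays the role of the simulated outage capacity and can be compared against the Gaussian quantile (37) once the mean and variance are available.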


B. Finite-SNR Diversity-Multiplexing Tradeoff 


The concept of DMT was originally proposed in [44] to characterize the diversity gain, which is related to link reliability, and the multiplexing gain, which is related to spectral efficiency. The DMT indicates that both types of performance gains can be obtained simultaneously while satisfying a fundamental tradeoff. The operational interpretation of the DMT framework is via the existence of universal codes, which are tradeoff optimal in the high-SNR regime [45]. In space-time code design [46], DMT represents a useful analytical tool to characterize the asymptotic performance of codes. However, the asymptotic tradeoff is too optimistic an upper bound for estimating the operational performance at realistic SNRs. Recent works have shown that codes optimized at high SNR may not be optimal at low or moderate SNR. Motivated by these facts, Narasimhan [47] proposed a finite-SNR DMT framework, which characterizes the non-asymptotic DMT. There, he studied the finite-SNR DMT for correlated Rayleigh and Rician MIMO channels at realistic SNR levels.

Under the assumptions of slow fading and capacity-achieving codes with rate $r$, the multiplexing gain $m$ of a MIMO channel is defined according to | |4^ Eq. (21)] as

$$m = \frac{n}{\mu_{\mathcal{I}}}\, r,$$

where $n = \min(R, S, T)$. The multiplexing gain provides an indication of the sensitivity of the rate adaptation strategy as the SNR changes. When the applied codes have a higher multiplexing gain, the rate adaptation tends to respond more dramatically to the SNR variations. At a fixed multiplexing gain, the finite-SNR diversity gain $d(m,\gamma)$ is defined as the negative slope of the log-log plot of the outage probability $P_{\mathrm{out}}(x)$ at rate $r = m\mu_{\mathcal{I}}/n$ versus the SNR $\gamma$,

$$d(m,\gamma) = -\frac{\partial \log P_{\mathrm{out}}(m\mu_{\mathcal{I}}/n)}{\partial \log \gamma}. \tag{38}$$

At a particular SNR $\gamma$ and multiplexing gain $m$, the diversity gain (38) provides an estimate of the additional SNR needed to reduce the outage probability by a certain amount. Using the derived outage probability (36), the finite-SNR DMT can be obtained for the Rayleigh product channel.


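Proposition 5. When $R = T$, the finite-SNR DMT of Rayleigh product channels can be approximated by

$$d(m,\gamma) = \frac{2\gamma \exp\!\big(-K(m,\gamma)^2\big)}{\sqrt{\pi}\, \big(1 + \mathrm{erf}(-K(m,\gamma))\big)}\, \frac{\partial K(m,\gamma)}{\partial \gamma}, \tag{39}$$

where $K(m,\gamma) = \frac{(n - m)\, \mu_{\mathcal{I}}}{\sqrt{2}\, n\, \sigma_{\mathcal{I}}}$ with $\mu_{\mathcal{I}}$ and $\sigma_{\mathcal{I}}^2$ given in Propositions 1 and 3, respectively.

Proof. The proof of Proposition 5 follows by substituting (36) into (38). □

Note that the approximation (39) is tight in the asymptotic regime (6). This is because the approximation error is induced from (36).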
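The definition (38) can also be evaluated numerically, which offers a simple cross-check of any closed-form expression for the diversity gain. The sketch below approximates $-\partial \log P_{\mathrm{out}} / \partial \log \gamma$ by a central difference, with $P_{\mathrm{out}}$ taken from the Gaussian CDF (36) at rate $m\mu_{\mathcal{I}}/n$. The callables `mean_fn` and `std_fn`, which map the SNR to the capacity mean and standard deviation, are hypothetical stand-ins; the toy versions in the usage lines are assumptions for illustration and are not the asymptotic expressions of this paper.

```python
import math


def outage_probability(rate, mu, sigma):
    """Gaussian approximation (36) of P(I <= rate)."""
    return 0.5 + 0.5 * math.erf((rate - mu) / (math.sqrt(2.0) * sigma))


def finite_snr_diversity(m, n, snr, mean_fn, std_fn, rel_step=1e-4):
    """Central-difference estimate of d(m, snr) = -d log P_out / d log snr, cf. (38)."""
    def log_p_out(g):
        mu, sigma = mean_fn(g), std_fn(g)
        return math.log(outage_probability(m * mu / n, mu, sigma))

    h = rel_step  # step size in log(snr)
    upper = log_p_out(snr * math.exp(h))
    lower = log_p_out(snr * math.exp(-h))
    return -(upper - lower) / (2.0 * h)


if __name__ == "__main__":
    # Toy stand-ins (assumptions for illustration only).
    def toy_mean(g):
        return 2.0 * math.log(1.0 + g)

    def toy_std(g):
        return 0.5

    print(finite_snr_diversity(1.0, 2, 10 ** 0.5, toy_mean, toy_std))
```

Because the derivative is taken with respect to $\log\gamma$, the step is applied multiplicatively to the SNR, which keeps the estimate well scaled over several decades of SNR.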

Fig. 5 shows the finite-SNR DMT of a 2 × 2 Rayleigh product channel with $S = 2$ scatterers. The approximated tradeoff curves are generated by (39) at SNRs $\gamma = 0$ dB and $\gamma = 5$ dB. Compared to the Monte Carlo simulations, the proposed approximation yields a close estimate of the MIMO diversity gain. As $m$ approaches the maximum multiplexing gain, the discrepancies between the approximation and simulation curves decrease. When $R = T = 4$ antennas are used, the MIMO channel achieves improved channel diversity for a given multiplexing gain as shown in Fig. 6. In both figures, we have
also plotted the asymptotic DMT of Rayleigh product channels according to |[^ Eq. ( 8 )], when SNR 7 
approaches infinity. It is clear that the asymptotic results significantly overestimate the channel diversity at the considered operational SNR levels, which justifies the usefulness of the proposed approximation.

Fig. 5. Finite-SNR DMT of a 2 × 2 Rayleigh product channel with $S = 2$. Solid line: approximation calculated by (39); markers: simulations; dashed line: asymptotic DMT with SNR $\gamma \to \infty$.

Fig. 6. Finite-SNR DMT of a 4 × 4 Rayleigh product channel with $S = 2$. Solid line: approximation calculated by (39); markers: simulations; dashed line: asymptotic DMT with SNR $\gamma \to \infty$.
VI. Conclusions

We studied the outage probability of Rayleigh product channels, which explicitly model the rank deficiency effect. Using free probability theory, the asymptotic variance of the channel capacity is calculated for large channel matrices and becomes exact when the matrix dimensions approach infinity. Compared to conventional Rayleigh MIMO channels, Rayleigh product channels induce a higher capacity fluctuation, which is determined by the second-order $R$-transform of the channel matrix. We have proved that the channel capacity is asymptotically Gaussian by establishing a CLT for the relevant linear spectral statistics. Numerical results show that the proposed Gaussian approximation is reasonably accurate for realistic channel dimensions. The results have been utilized to characterize the tradeoff between diversity and multiplexing of Rayleigh product channels at finite SNR, where the asymptotic tradeoff for large SNR may be an over-optimistic estimate.
Appendix A

Proof of Proposition 3


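Let $S_Q^c \subset \mathbb{R}$ denote the complement of the support $S_Q$ on the real axis. It is shown in [49] that for a given open interval $\mathcal{T} \subset S_Q^c$, the function $G_Q(\cdot)$ is continuous, real, and decreasing. This is also true for $G_P(\cdot)$ with $\mathcal{T} \subset S_P^c$. Therefore, there exists an inverse function $G^{-1}(\cdot)$ that is continuous, real, and decreasing over $\{t \in \mathbb{R} : t = G(x),\ x \in \mathcal{T}\}$. We choose the contour $\mathcal{C}_x$ to be inside of $\mathcal{C}_y$ such that they both cross the real axis in the intervals $(-1/\gamma, 0)$ and $(\lambda_r, \infty)$, where $\lambda_r$ denotes the right end-point of the support $S_Q$. By the substitutions $t_1 = G(x)$ and $t_2 = G(y)$, the integral (20) can be alternatively integrated over contours $\mathcal{C}_1$ and $\mathcal{C}_2$ as

$$\sigma_{\mathcal{I}}^2 = -\frac{1}{4\pi^2} \oint_{\mathcal{C}_1}\oint_{\mathcal{C}_2} \frac{\log\big(1 + \gamma G^{-1}(t_1)\big)\, \log\big(1 + \gamma G^{-1}(t_2)\big)}{(t_1 - t_2)^2}\, dt_1\, dt_2 = \frac{1}{2\pi i} \oint_{\mathcal{C}_2} \log\big(1 + \gamma G^{-1}(t_2)\big)\, \mathcal{K}_{\mathrm{inner}}(t_2)\, dt_2, \tag{40}$$

where

$$\mathcal{K}_{\mathrm{inner}}(t_2) = \frac{1}{2\pi i} \oint_{\mathcal{C}_1} \frac{\log\big(1 + \gamma G^{-1}(t)\big)}{(t - t_2)^2}\, dt. \tag{41}$$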
The transformed contours $\mathcal{C}_1$ and $\mathcal{C}_2$ cross the real axis in the intervals $(G(0^-), G(-1/\gamma)) = (-\infty, G(-1/\gamma))$ and $(G(\infty), G(\lambda_r)) = (0, G(\lambda_r))$, where

$$G(0^-) = \lim_{x \to 0^-} G(x), \qquad G(\infty) = \lim_{x \to \infty} G(x).$$


The inverse function Gp^{t) is calculated via (191 as 




We obtain the inverse function hy solving the quadratic equation (| 8 |) in z as 


= 


’Q 


1 - (1 - c)f ± v/i + (i-C)2t2-2(i + c)f 

2 Cf 2 


(42) 


(43) 


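where the minus sign is taken by the requirement $\lim_{z \to \infty} z\, G_Q(z) = 1$ [49]. Substituting (42) and (43) into (41) and applying integration by parts, we can rewrite $\mathcal{K}_{\mathrm{inner}}(t_2)$ as

$$\mathcal{K}_{\mathrm{inner}}(t_2) = \frac{1}{2\pi i} \oint_{\mathcal{C}_1} \frac{\gamma\, \big(G^{-1}(t)\big)'}{(t - t_2)\big(1 + \gamma G^{-1}(t)\big)}\, dt = -\frac{\gamma}{2\pi i} \oint_{\mathcal{C}_1} \frac{2(\zeta - 1)t^2 + 3t - 1}{t\,(t - 1)(t - t_2)(t - \omega_r)(t - \omega_+)(t - \omega_-)}\, dt, \tag{44}$$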

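where $\omega_r$, $\omega_+$, and $\omega_-$ are the three roots of the cubic equation $t^3 - 2t^2 + (1 - \gamma + \gamma\zeta)\, t + \gamma = 0$ and $\omega_r$ denotes the real solution such that $\omega_r = G(-1/\gamma) < 0$. The integrand of (44) has two simple poles at $t = 0$ and $t = \omega_r$ in $\mathcal{C}_1$, and by applying the residue theorem, the integral $\mathcal{K}_{\mathrm{inner}}(t_2)$ becomes

$$\mathcal{K}_{\mathrm{inner}}(t_2) = \frac{1}{t_2} - \frac{1}{t_2 - \omega_r}. \tag{45}$$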

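Substituting (45) into (40), the variance $\sigma_{\mathcal{I}}^2$ can therefore be expressed as

$$\begin{aligned}
\sigma_{\mathcal{I}}^2 &= \frac{1}{2\pi i} \oint_{\mathcal{C}_2} \log\big(1 + \gamma G^{-1}(t)\big) \left( \frac{1}{t} - \frac{1}{t - \omega_r} \right) dt \\
&= \frac{1}{2\pi i} \oint_{\mathcal{C}_2} \log\frac{(t - \omega_+)(t - \omega_-)}{(t - 1)^2} \left( \frac{1}{t} - \frac{1}{t - \omega_r} \right) dt + \frac{1}{2\pi i} \oint_{\mathcal{C}_2} \log\frac{t - \omega_r}{t} \left( \frac{1}{t} - \frac{1}{t - \omega_r} \right) dt.
\end{aligned} \tag{46}$$

The second integral in (46) has an antiderivative $-\frac{1}{2}\big(\log\frac{t - \omega_r}{t}\big)^2$, which is single-valued over $\mathcal{C}_2$, and therefore vanishes due to Cauchy's theorem. Applying the residue theorem to the first integral in (46), we obtain

$$\sigma_{\mathcal{I}}^2 = \log \frac{(\omega_r - 1)^2\, \omega_+ \omega_-}{(\omega_r - \omega_+)(\omega_r - \omega_-)}.$$

The proof is completed by the fact that $\omega_r \omega_+ \omega_- = -\gamma$.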
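For readers who wish to evaluate the closed-form variance numerically, the sketch below computes it in a few lines. It is an illustration that assumes the cubic equation $t^3 - 2t^2 + (1 - \gamma + \gamma\zeta)t + \gamma = 0$ and the expression $\sigma_{\mathcal{I}}^2 = \log\frac{(\omega_r - 1)^2 \omega_+\omega_-}{(\omega_r - \omega_+)(\omega_r - \omega_-)}$ as read from this appendix; the function names are hypothetical. Since the product of the roots is $-\gamma < 0$ and their sum is $2 > 0$, the cubic has exactly one negative real root, which is located by bisection. The remaining two roots enter only through their product and through the deflated quadratic evaluated at $\omega_r$, so no complex arithmetic is needed.

```python
import math


def negative_root(snr, zeta, iterations=200):
    """Unique negative real root of t^3 - 2 t^2 + (1 - snr + snr*zeta) t + snr = 0."""
    def f(t):
        return ((t - 2.0) * t + (1.0 - snr + snr * zeta)) * t + snr

    lo, hi = -1.0, 0.0       # f(0) = snr > 0
    while f(lo) > 0.0:       # expand the bracket until the sign changes
        lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def asymptotic_variance(snr, zeta):
    """Closed-form variance, assuming the expressions quoted in the lead-in."""
    w_r = negative_root(snr, zeta)
    # Deflation: cubic = (t - w_r)(t^2 + p t + q), hence w_plus * w_minus = q and
    # (w_r - w_plus)(w_r - w_minus) = w_r^2 + p w_r + q.
    p, q = w_r - 2.0, -snr / w_r
    return math.log((w_r - 1.0) ** 2 * q / (w_r * w_r + p * w_r + q))


if __name__ == "__main__":
    print(asymptotic_variance(snr=1.0, zeta=1.0))
```

The argument of the logarithm is a ratio of positive real quantities, so the returned value is real even when $\omega_+$ and $\omega_-$ form a complex-conjugate pair.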
Appendix B

Proof of Corollary 1

When C > C) 11 ( 7 , Cy can be expanded at 7 —)■ 00 as 

u(-f, c) = + 0 ( 7 -‘), 

u( 7 . 0 " = 3 (C - 1)7 - + 0 ( 1 ). 


The real and negative solution ujr of 

2 37C-37-1-1/(7, 0 " _ 1 I 


OJr = - — 


corresponds to ti and it follows from (231 that 

3 3n(7,C) 1-C 

When C < the asymptotic expansion of 11 ( 7 , C) at 7 —)• cx) yields 

uh, C) = (-1)^/®V3(1 - + 0(1) = ef ^3(1 - C)7^^^ + 0(1), 


(47) 


(48) 


where we took the principle value (—l)t /6 = In this case, the real and negative solution ojr of (22i 
corresponds to t 2 - Inserting (|4^ into (|24|), we have 


to, 


= 7 “ J\/3(1-C)7 (e" + e ^ - ^(1 - C)7- 


3 3 V 7 77 , V 73 

Inserting ( [47| ) into ( [2T] ) and taking derivative with respect to (, we obtain 

d 2 ((C - 1)^7 - 2 c 2 + C) _ 2 


(49) 


dc 




C(C- 1 )((C-1)37 + 20 C(C-I) 


< 0 . 


Therefore, when $\zeta > 1$ the asymptotic variance is a monotonically decreasing function of $\zeta$. When $\zeta < 1$, the variance is a monotonically increasing function of $\zeta$, where the derivative $d\sigma_{\mathcal{I}}^2/d\zeta > 0$ with $\omega_r$ given by (49). To sum up, the asymptotic variance is maximized when $\zeta$ approaches 1 from either side. This completes the proof of Corollary 1.

Appendix C 

Non-Crossing Permutations 

Let us introduce the main combinatorial objects, the non-crossing disc and annular permutations, and 
the related notations, which are used in Lemma [T] and in the proof of Proposition We refer the readers 


to [25], [41], [42] for a comprehensive description of the non-crossing permutations.

For a positive integer $n$, we denote the set $\{1, \ldots, n\}$ as $[n]$. Let $\mathcal{P}_n$ denote the set of all partitions of $[n]$. Given a partition $\pi \in \mathcal{P}_n$, we have $\pi = \{B_1, \ldots, B_k\}$, where $B_1, \ldots, B_k$, called blocks of $\pi$, are non-empty disjoint subsets of $[n]$, i.e., $B_1 \cup \cdots \cup B_k = [n]$ and $B_i \cap B_j = \emptyset$ for $i \ne j$. Given two partitions $\pi_1, \pi_2 \in \mathcal{P}_n$, we have $\pi_1 \le \pi_2$ if and only if every block of $\pi_1$ is contained in a block of $\pi_2$, and we denote by $1_n = \{1, \ldots, n\}$ the largest partition over $[n]$. We say a partition $\pi$ is non-crossing in the disc sense if there do not exist $1 \le i, j \le k$, $i \ne j$, and $1 \le a < b < c < d \le n$ such that $a, c \in B_i$ and $b, d \in B_j$. A non-crossing disc partition $\pi \in \mathcal{P}_n$ can be visualized as follows: draw the points $1, \ldots, n$ clockwise around the boundary of a disc and connect the points belonging to the same block with a convex hull. The partition $\pi$ is non-crossing if the convex hulls are pairwise disjoint.
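The non-crossing condition is easy to test mechanically. The following sketch, with hypothetical helper names, enumerates all partitions of $[n]$ and checks the condition "$a < b < c < d$ with $a, c$ in one block and $b, d$ in another" directly from the definition above. It is a brute-force illustration suited to small $n$ only.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield all partitions of a list as lists of blocks."""
    if not elements:
        yield []
        return
    head, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put head into one of the existing blocks ...
        for k in range(len(partition)):
            yield partition[:k] + [[head] + partition[k]] + partition[k + 1:]
        # ... or into a new block of its own.
        yield [[head]] + partition


def is_noncrossing(blocks):
    """True if no a < b < c < d exist with a, c in one block and b, d in another."""
    for block_i, block_j in combinations(blocks, 2):
        for first, second in ((block_i, block_j), (block_j, block_i)):
            for a, c in combinations(sorted(first), 2):
                inside = [b for b in second if a < b < c]
                beyond = [d for d in second if d > c]
                if inside and beyond:
                    return False
    return True


if __name__ == "__main__":
    print(is_noncrossing([[1, 3], [2, 4]]))   # the only crossing partition of [4]
    print(sum(is_noncrossing(p) for p in set_partitions([1, 2, 3, 4])))
```

For $n = 4$ the only crossing partition is $\{\{1,3\},\{2,4\}\}$, so 14 of the 15 partitions are non-crossing, in agreement with the Catalan numbers.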

A concept closely related to the partition is the set permutation. Let $S_n$ denote the set of all permutations over $[n]$. Given a permutation $\tau \in S_n$, we have $\tau = A_1 \cdots A_k$ such that $[n]$ is decomposed into $k$ orbits and $A_i = (a_i(1), \ldots, a_i(s))$ is the $i$-th orbit of $\tau$ containing $s$ elements. For two elements $a_i(p), a_i(q)$ belonging to the same orbit $A_i$, there exists an integer $m \ge 1$ such that $\tau^m(a_i(p)) = a_i(q)$. For instance, if $\tau = (1,4,5)(2,3) \in S_5$, it maps the elements as $\tau(1) = 4$, $\tau(4) = 5$, $\tau(5) = 1$, $\tau(2) = 3$, and $\tau(3) = 2$. The notation $\#(\tau)$ is used for the number of orbits of $\tau$. We say a permutation is standard in the disc sense if for every orbit $A_i = (a_i(1), \ldots, a_i(s))$ of $\tau$, there is $a_i(1) < \cdots < a_i(s)$. A standard disc permutation $\tau$ has an induced partition $\pi$, where each block of $\pi$ contains the same elements as the corresponding orbit of $\tau$. In addition, if the partition induced by a standard permutation $\tau$ is non-crossing in the disc sense, $\tau$ is a non-crossing disc permutation, and we denote the set of all non-crossing disc permutations on $[n]$ as $S_{\text{d-nc}}(n)$. Let $\eta = (1, \ldots, n)$ be the forward cyclic permutation of $[n]$. A permutation $\tau \in S_n$ satisfies the so-called geodesic condition

$$\#(\tau) + \#(\tau^{-1}\eta) \le n + 1, \tag{50}$$

where $\tau^{-1}\eta$, or alternatively $\tau^{-1} \circ \eta$, is the composite permutation obtained by first applying $\eta$ and then $\tau^{-1}$. The equality in (50) holds only when $\tau$ is a non-crossing disc permutation. The geodesic condition can be intuitively viewed as the triangle inequality for the Cayley graph of the permutation group $S_n$ [50]. Let $\tau_1, \tau_2 \in S_n$; the distance between $\tau_1$ and $\tau_2$ in the Cayley graph of $S_n$ amounts to $d(\tau_1, \tau_2) = n - \#(\tau_1^{-1}\tau_2)$. The inequality (50) can be rewritten in terms of the Cayley distance as $d(\mathrm{id}, \tau) + d(\tau, \eta) \ge d(\mathrm{id}, \eta)$, where $\mathrm{id}$ is the identity permutation. The condition that the permutation $\tau$ is non-crossing is equivalent to $\tau$ lying on the geodesic connecting $\mathrm{id}$ and $\eta$ in the Cayley graph.
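The geodesic condition (50) and its equality case can be confirmed by exhaustive enumeration for small $n$. The sketch below, whose helper names are illustrative, counts orbits of $\tau$ and of $\tau^{-1}\eta$ for every $\tau \in S_n$ and compares the equality case with a direct test of the non-crossing disc property as defined above (increasing orbits and a non-crossing induced partition).

```python
from itertools import permutations


def orbits(perm):
    """Orbits of a permutation stored as a dict {x: perm(x)}, each listed from its minimum."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        orbit, x = [], start
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = perm[x]
        result.append(orbit)
    return result


def crosses(block_a, block_b):
    """True if a < b < c < d exist with a, c in block_a and b, d in block_b."""
    return any(a < b < c < d
               for a in block_a for c in block_a
               for b in block_b for d in block_b)


def is_noncrossing_disc(perm):
    """Standard in the disc sense (increasing orbits) with a non-crossing induced partition."""
    blocks = orbits(perm)
    if any(block != sorted(block) for block in blocks):
        return False
    return not any(crosses(a, b) for a in blocks for b in blocks if a is not b)


def check_geodesic(n):
    """Return (number of tau attaining equality in (50), whether (50) and its equality case hold)."""
    points = list(range(1, n + 1))
    eta = {x: x % n + 1 for x in points}                # forward cycle (1, ..., n)
    attained, consistent = 0, True
    for images in permutations(points):
        tau = dict(zip(points, images))
        tau_inv = {v: k for k, v in tau.items()}
        product = {x: tau_inv[eta[x]] for x in points}  # apply eta first, then tau^{-1}
        total = len(orbits(tau)) + len(orbits(product))
        equality = total == n + 1
        attained += equality
        consistent &= total <= n + 1 and equality == is_noncrossing_disc(tau)
    return attained, consistent


if __name__ == "__main__":
    print(check_geodesic(4))
```

For $n = 4$ the equality is attained by exactly the 14 non-crossing disc permutations, which matches the count of non-crossing partitions of $[4]$.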

Let us consider another set of permutations $S_{m+n}$, illustrated via topological drawing in the $(m,n)$-annular sense. Instead of placing $m + n$ points on the boundary of one disc, we use two concentric circles. The points $1, \ldots, m$ are placed clockwise on the external circle and the points $m+1, \ldots, m+n$ are placed counter-clockwise on the internal circle. The annulus between the two circles is referred to as the $(m,n)$-annulus. Given a permutation $\tau \in S_{m+n}$, it is visualized by drawing curves within the $(m,n)$-annulus, which connect the elements of each orbit, respectively. Let $A = (a(1), \ldots, a(s))$ be an orbit of $\tau$ with $s$ elements. The corresponding curve connects $a(1)$ to $a(2)$, then $a(2)$ to $a(3)$, ..., then $a(s)$ to $a(1)$ such that: 1) it does not intersect itself; 2) it encloses a region completely contained in the $(m,n)$-annulus; 3) it goes clockwise around the region. We say a permutation $\tau$ is $(m,n)$-connected if at least one orbit of $\tau$ contains elements on both circles; otherwise $\tau$ is $(m,n)$-disconnected. In addition, $\tau \in S_{m+n}$ is standard in the $(m,n)$-annular sense if each orbit of $\tau$ satisfies either of the following conditions:

1) The orbit $A$ of $\tau$ satisfies $A \subset \{1, \ldots, m\}$ or $A \subset \{m+1, \ldots, m+n\}$. The elements of $A$, upon cyclic permutations, can be sorted in increasing order.

2) $A \cap \{1, \ldots, m\} \ne \emptyset$ and $A \cap \{m+1, \ldots, m+n\} \ne \emptyset$. We have $A = (a(1), \ldots, a(k), b(1), \ldots, b(l))$, where $a(1), \ldots, a(k) \in \{1, \ldots, m\}$ and $b(1), \ldots, b(l) \in \{m+1, \ldots, m+n\}$. Both sequences $\{a(i)\}$ and $\{b(j)\}$, upon cyclic permutations, can be sorted in increasing order, respectively.

We say a permutation $\tau \in S_{m+n}$ is non-crossing in the $(m,n)$-annular sense if $\tau$ is standard and the regions enclosed by the orbits of $\tau$ do not overlap in the annular visualization described above. We denote by $S_{\text{a-nc}}(m,n)$ the set of non-crossing $(m,n)$-annular permutations. Finally, according to [ |^ Th. 6.1], an $(m,n)$-connected permutation $\tau \in S_{m+n}$ satisfies a geodesic condition in the $(m,n)$-annular sense,

$$\#(\tau) + \#(\tau^{-1}\eta_0) \le m + n, \tag{51}$$

where $\eta_0 = (1, \ldots, m)(m+1, \ldots, m+n)$ and the equality holds only when $\tau \in S_{\text{a-nc}}(m,n)$.
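The annular geodesic bound (51) can likewise be checked by brute force for small $m$ and $n$. The sketch below uses 0-based labels, with points $0, \ldots, m-1$ on the external circle and $m, \ldots, m+n-1$ on the internal circle; the function names are illustrative. It only verifies the inequality and that the bound is attained, and it does not attempt to test the topological non-crossing property itself.

```python
from itertools import permutations


def count_cycles(perm):
    """Number of orbits of a permutation stored as a tuple (perm[i] is the image of i)."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count


def is_connected(perm, m):
    """True if some orbit contains points of both circles (some point maps across)."""
    return any((i < m) != (perm[i] < m) for i in range(len(perm)))


def annular_geodesic_values(m, n):
    """Values #(tau) + #(tau^{-1} eta0) over all (m, n)-connected tau."""
    size = m + n
    eta0 = tuple([(i + 1) % m for i in range(m)] + [m + (j + 1) % n for j in range(n)])
    values = []
    for tau in permutations(range(size)):
        if not is_connected(tau, m):
            continue
        tau_inv = [0] * size
        for i, image in enumerate(tau):
            tau_inv[image] = i
        product = tuple(tau_inv[eta0[i]] for i in range(size))  # eta0 first, then tau^{-1}
        values.append(count_cycles(tau) + count_cycles(product))
    return values


if __name__ == "__main__":
    vals = annular_geodesic_values(2, 2)
    print(max(vals), len(vals))
```

For $(m,n) = (2,2)$ every connected permutation satisfies the bound $m + n = 4$, and the bound is attained, for example, by the transposition joining one point of each circle.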

Appendix D

Proof of Lemma 1

The proof relies on a known combinatorial identity of the moments of Gaussian random variables, which is stated below.

Lemma 2. (Wick's Lemma) Let $Z_1, \ldots, Z_t$ denote i.i.d. complex Gaussian random variables with zero mean and unit variance.

1) Let $m, n$ be positive integers such that $m \ne n$, and consider two functions $\alpha : [m] \to [t]$ and $\beta : [n] \to [t]$. Then

$$\mathbb{E}\big[Z_{\alpha(1)} \cdots Z_{\alpha(m)}\, \bar Z_{\beta(1)} \cdots \bar Z_{\beta(n)}\big] = 0. \tag{52}$$

2) Let $n$ be a positive integer and consider two functions $\alpha, \beta : [n] \to [t]$. Then

$$\mathbb{E}\big[Z_{\alpha(1)} \cdots Z_{\alpha(n)}\, \bar Z_{\beta(1)} \cdots \bar Z_{\beta(n)}\big] = \mathrm{card}\{\tau \in S_n \mid \alpha = \beta \circ \tau\},$$

where $\mathrm{card}\{\cdot\}$ denotes the cardinality.
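The counting rule in Wick's lemma can be illustrated with a few lines of code. The sketch below, with the hypothetical function name `wick_moment`, returns the mixed moment as the number of permutations $\tau$ with $\alpha = \beta \circ \tau$ and returns zero when the numbers of conjugated and unconjugated factors differ. Index sequences are passed as Python lists.

```python
from itertools import permutations


def wick_moment(alpha, beta):
    """E[Z_a(1)...Z_a(m) conj(Z_b(1))...conj(Z_b(n))] for i.i.d. CN(0, 1) variables.

    alpha and beta are index sequences. For equal lengths the value is the number
    of permutations tau with alpha[i] == beta[tau[i]] for all i; otherwise it is 0.
    """
    if len(alpha) != len(beta):
        return 0
    n = len(alpha)
    return sum(all(alpha[i] == beta[tau[i]] for i in range(n))
               for tau in permutations(range(n)))


if __name__ == "__main__":
    print(wick_moment([1, 1], [1, 1]))   # E|Z_1|^4
    print(wick_moment([1, 2], [2, 1]))   # E|Z_1|^2 |Z_2|^2
    print(wick_moment([1, 1], [1, 2]))   # unmatched indices
```

With a single variable the count reduces to $n!$, which is the familiar moment $\mathbb{E}|Z|^{2n} = n!$ of a unit-variance complex Gaussian.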


Denote Q = with the entry qij given hy 

^ S T S 

Qij ~ ^ ^ ^ ^ ^ ^ '^ai(^ab^cb'4’cj- (53) 

a=l b=l c=l 

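In light of [26, Th. 2.12], the second-order free cumulant of $Q$ can be expressed in terms of the classical joint cumulants of the entries $q_{ij}$ as

$$\kappa_{m,n} = \lim_{R \to \infty} R^{m+n}\, k_{m+n}(\mathbf{q}_{m,n}), \tag{54}$$

where the vector $\mathbf{q}_{m,n} = [q_{i(1)i(2)}, \ldots, q_{i(m)i(1)}, q_{i(m+1)i(m+2)}, q_{i(m+2)i(m+3)}, \ldots, q_{i(m+n)i(m+1)}]$ can be any distinct choice of $i(1), \ldots, i(m+n)$. For a partition $\pi \in \mathcal{P}_n$, we define $\pi_i = \{\pi_i(1), \ldots, \pi_i(s)\} \subset \pi$ as a block of $\pi$ with $s$ elements. The expectation over the blocks of the partition $\pi$ is defined as

$$\mathbb{E}_{\pi}[a_1, \ldots, a_n] = \prod_{\pi_i \subset \pi} \mathbb{E}\big[a_{\pi_i(1)} \cdots a_{\pi_i(s)}\big].$$

Using the cumulant-moment relations [26, Eq. (10)], $k_{m+n}$ can be written as a sum of $\mathbb{E}_{\pi}[\mathbf{q}_{m,n}]$ over all $\pi \in \mathcal{P}_{m+n}$, namely

$$k_{m+n}(\mathbf{q}_{m,n}) = \sum_{\pi \in \mathcal{P}_{m+n}} \mathbb{E}_{\pi}[\mathbf{q}_{m,n}]\, \mathrm{M\ddot{o}b}_{\mathcal{P}_{m+n}}(\pi, 1_{m+n}), \tag{55}$$

where $\mathrm{M\ddot{o}b}_{\mathcal{P}_{m+n}} : \mathcal{P}_{m+n} \times \mathcal{P}_{m+n} \to \mathbb{C}$ denotes the Möbius function [42] on $\mathcal{P}_{m+n}$, which satisfies

$$\sum_{\substack{\eta \in \mathcal{P}_{m+n} \\ \pi \le \eta}} \mathrm{M\ddot{o}b}_{\mathcal{P}_{m+n}}(\eta, 1_{m+n}) = \begin{cases} 1 & \text{if } \pi = 1_{m+n}, \\ 0 & \text{otherwise.} \end{cases} \tag{56}$$

Inserting (53) into (55) and applying Lemma 2, we see that for a given partition $\pi$ the multiplicative moment $\mathbb{E}_{\pi}[\mathbf{q}_{m,n}]$ is non-zero only when the partition $\pi$ takes the forms $\pi^{(1)} = 1_{m+n}$ and $\pi^{(2)} = \{\{1, \ldots, m\}, \{m+1, \ldots, m+n\}\}$. The corresponding Möbius function can be calculated via (56) as

$$\mathrm{M\ddot{o}b}_{\mathcal{P}_{m+n}}\big(\pi^{(2)}, 1_{m+n}\big) = -1. \tag{57}$$

It follows from (55) and (57) that the cumulant $k_{m+n}(\mathbf{q}_{m,n})$ equals $k_{m+n}(\mathbf{q}_{m,n}) = \mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}] - \mathbb{E}_{\pi^{(2)}}[\mathbf{q}_{m,n}]$.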

Denote the permutation rjo = (1,. •. ,m){m + I, ... ,m + n) G Sm+n- Substituting qi(t)i{rio(t))^ I = 
... ,m + n, into (531, we can express E^(i)[qm,n] as 

. t=i 


{RS) 


l<ai,...,aTn + n^5' 
l<bi,...,6^+^<T 


'771+71 

Ctbt'4^Cti{r)Q{t)) 


(58) 
It is convenient to introduce the functions $A : [m+n] \to [S]$, $B : [m+n] \to [T]$, $C : [m+n] \to [S]$.
Due to the independence between and O^i, we can rewrite (58 1 as 

'm+n m+n 


1 


(RS) 


m+n 


E 

A,B,C 


n 


n ^C{t)B{t) 
t=l t=l 


E 


m+n 


m+n 


n n 


t=l 


t=l 


(59) 


Since the indexes i(l),... ,i(m + n) are distinct, hy Lemma[^the second expectation in (|5^ is non-zero 


only when CorjQ^ = A. The first expectation in (591 is calculated hy (52) with a{t) = {A{t),B{t)) and 


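$\beta(t) = (C(t), B(t))$. For a given permutation $\tau \in S_{m+n}$, the summands of (59) should fulfill $A = C \circ \tau$ and $B = B \circ \tau$ to be able to contribute to the summation. To summarize, $\mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}]$ is expressed as

$$\mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{m+n}} \sum_{B, C} \mathrm{card}\big\{\tau \in S_{m+n} \mid C \circ (\tau^{-1}\eta_0) = C,\ B \circ \tau = B\big\}. \tag{60}$$

Interchange the summation and cardinality operations in (60) and write (60) as a sum over the permutation $\tau$,

$$\mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{m+n}} \sum_{\tau \in S_{m+n}} \mathrm{card}\big\{(B, C) \mid C \circ (\tau^{-1}\eta_0) = C,\ B \circ \tau = B\big\}. \tag{61}$$

The condition $C \circ (\tau^{-1}\eta_0) = C$ is equivalent to requiring $C$ to be constant on the orbits of $\tau^{-1}\eta_0$. For a given permutation $\tau^{-1}\eta_0$, there are $S^{\#(\tau^{-1}\eta_0)}$ ways to choose the indexes $C$. Similarly, the condition $B = B \circ \tau$ is equivalent to requiring $B$ to be constant on the orbits of $\tau$, and there are $T^{\#(\tau)}$ ways to choose the indexes $B$. As a result, (61) equals

$$\mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{m+n}} \sum_{\tau \in S_{m+n}} S^{\#(\tau^{-1}\eta_0)}\, T^{\#(\tau)}. \tag{62}$$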
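The counting step used to pass from (61) to (62), namely that an index function which is constant on the orbits of a permutation can be chosen in (number of labels) to the power (number of orbits) ways, can be verified by direct enumeration. The sketch below is a brute-force illustration with hypothetical function names and 0-based labels.

```python
from itertools import product


def count_cycles(perm):
    """Number of orbits of a permutation stored as a tuple (perm[i] is the image of i)."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count


def count_invariant_labelings(perm, num_labels):
    """Number of maps B from points to labels with B[perm[i]] == B[i] for every i."""
    size = len(perm)
    return sum(all(labels[perm[i]] == labels[i] for i in range(size))
               for labels in product(range(num_labels), repeat=size))


if __name__ == "__main__":
    tau = (1, 0, 2, 4, 3)  # orbits {0, 1}, {2}, {3, 4}
    print(count_invariant_labelings(tau, 3), 3 ** count_cycles(tau))
```

Both printed numbers coincide, since each orbit can be assigned one label independently of the others.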


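Following the same procedures as in (58)-(62), we obtain $\mathbb{E}_{\pi_1^{(2)}}[\mathbf{q}_{m,n}]$ and $\mathbb{E}_{\pi_2^{(2)}}[\mathbf{q}_{m,n}]$ as

$$\mathbb{E}_{\pi_1^{(2)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{m}} \sum_{\tau_1 \in S_m} S^{\#(\tau_1^{-1}\eta_1)}\, T^{\#(\tau_1)}, \tag{63}$$

$$\mathbb{E}_{\pi_2^{(2)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{n}} \sum_{\tau_2 \in S_n} S^{\#(\tau_2^{-1}\eta_2)}\, T^{\#(\tau_2)}, \tag{64}$$

where the permutations $\eta_1 = (1, \ldots, m) \in S_m$ and $\eta_2 = (1, \ldots, n) \in S_n$. We multiply (63) with (64) and combine the permutations $\tau_1$ and $\tau_2$ to form a new permutation $\tau = \tau_1 \circ \tilde\tau_2 \in S_{m+n}$, where $\tilde\tau_2$ is homogeneous to $\tau_2$ with the $i$-th element relabeled as $m + i$. Note that $\pi_{\tau} \le \pi_{\eta_0}$, where the partitions $\pi_{\tau}$ and $\pi_{\eta_0}$ are induced by $\tau$ and $\eta_0$, respectively. The new permutation $\tau$ is therefore $(m,n)$-disconnected, namely

$$\mathbb{E}_{\pi^{(2)}}[\mathbf{q}_{m,n}] = \mathbb{E}_{\pi_1^{(2)}}[\mathbf{q}_{m,n}]\, \mathbb{E}_{\pi_2^{(2)}}[\mathbf{q}_{m,n}] = \frac{1}{(RS)^{m+n}} \sum_{\substack{\tau \in S_{m+n}, \\ (m,n)\text{-disconnected}}} S^{\#(\tau^{-1}\eta_0)}\, T^{\#(\tau)}.$$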
Inserting $k_{m+n}(\mathbf{q}_{m,n}) = \mathbb{E}_{\pi^{(1)}}[\mathbf{q}_{m,n}] - \mathbb{E}_{\pi^{(2)}}[\mathbf{q}_{m,n}]$ into (54), we obtain

1 


= lim 




^»?o)+#('r) 


rGSm + n, 
(m,n) -connected 



According to (51), the exponent satisfies $\#(\tau) + \#(\tau^{-1}\eta_0) \le m + n$ for every $(m,n)$-connected $\tau \in S_{m+n}$. In addition, the equality holds only when $\tau$ is non-crossing in the $(m,n)$-annular sense. Letting $S \to \infty$, all terms in the summation with crossing permutations vanish, and for $\tau \in S_{\text{a-nc}}(m,n)$ the factor $S^{m+n}$ cancels with $S^{\#(\tau) + \#(\tau^{-1}\eta_0)}$. The derivation for the first-order cumulants follows similarly.


Appendix E 

Proof of Proposition 4

Denote Qq{z) = 0Qiz) as the expected resolvent of Q, which is averaged over the ensembles 
of As matrix P is random, Qq is also a random variable and is the solution of Q with Ep(-) replaced 
by its empirical version Ep(•)^ namely 


z = ^+P 

yq, 


AdFp(A) 
1 - \Qq 


(65) 


We divide Gr{z) into two parts as 


Gr{z) = R (gci{z) - Gq{z)) + R (g^{z) - Gc^iz)) = G\{z) + G|(z). (66) 

The proof of the asymptotic Gaussianity of $G_R(z)$ follows in two steps, showing the asymptotic Gaussianity of $G_R^{1}(z)$ and $G_R^{2}(z)$, respectively. The proof then boils down to a direct application of Bai and Silverstein's lemma [27, Lemma 1.1]:

Lemma 3. (CLT of Wishart-type ensembles) Consider an $N \times N$ Hermitian matrix $B = X^{\dagger} T X / N$ and assume:

1) $X$ is an $n \times N$ complex random matrix with i.i.d. entries, $\mathbb{E}[X_{ij}] = 0$, $\mathbb{E}[|X_{ij}|^2] = 1$, and $\mathbb{E}[|X_{ij}|^4] = 2$;

2) T is a non-random Hermitian nonnegative definite matrix and its ESD Tt(‘) converges weakly to 
a non-random limiting distribution F^f). 

Let Mn{z) = n (Gb{z) — and Cz a positive contour enclosing the support ofB, then the sequence 

{Mn{z)} is tight on the contour Cz, and Mn{z) converges weakly to a Gaussian process on the complex 
plane with E [Mn{z)] = 0 and Cov{Mn{zi), Mn{z 2 )) given by 0 
Conditioned on $P$, it is straightforward to verify that the complex Gaussian matrix fulfills assumption 1). The ESD of $P$ converges to the Marčenko-Pastur distribution and therefore fulfills assumption 2). Furthermore, the expected resolvent defined above is, by definition, the average of the resolvent of $Q$. It thus follows from Lemma 3 that
G\^{z) given P converges to a Gaussian process on the complex plane with E[Gj^(z)] = 0 and 

^q(^i)^q(^2) 1 


Cov(G)^(zi),G)^(z2)) = 




By (65 1 , we have 


Subtracting (|7]) from ( | 68 ] ) yields 
Qq — Qq, 


z = -^ + p 

yq 


AdFp(A) _ AdPp(A) 

1 — A^q 1 — \Qq 


+ P 


AdFp(A) 

1 — A^q 


(67) 


( 68 ) 


0 = 


^q) A2dFp(A) r / xdFp{X) AdFp(A)\ 


Qq-Sq = 


GqGq 

p GqGq 

c 


J (1 — A^q)(1 — A^q) 

/Ad^(A) AdFp(A)\ 

1 - A^q " 1 - A^q J ’ 


y1 — A^q 1 — A^q J 


(69) 


where C = 1- p GqGq f 


AMFp(A) 


(l-A^)(l-AgQ) 


. By definition of Cauchy transform, we have 


AdFp(A) 


1 


1 


1 - ^Gq Gq 


= -7^ + tttGp zr- 


^q 


1 


Gq 


and 


AdFp(A) 


1 — A^q Gq 


= - 7 ^ + TTtGp 7 ^ 


^q 


Gq 


(70) 


By inserting (70l into (69l and multiplying R on both sides of (69l, we obtain = 5 '^^p(1/^q) 


^p(l/^q) j /C. In the asymptotic regime ( 61 , Gq converges to Gq and C converges to 1 —f 


AMFp(A) 

_ (1-APq)^- 

It follows from Lemma 3 (with an $S \times S$ matrix $P = X^{\dagger}X$ and $T$ being an identity matrix) that $G_R^{2}(z)$ converges to a centered Gaussian process. Note that the covariance (67) of $G_R^{1}(z)$ is independent of $P$ and the randomness of $G_R^{2}(z)$ only comes from $P$, which makes $G_R^{1}(z)$ and $G_R^{2}(z)$ independent of each other. Combining the above arguments, $G_R(z) = G_R^{1}(z) + G_R^{2}(z)$ is asymptotically a sum of two independent Gaussian processes, and therefore $G_R(z)$ is also a Gaussian process.